Process for the selection of HIV-1 subtype C isolates, selected HIV-1 subtype C isolates, their genes and modifications and derivatives thereof

ABSTRACT

The invention provides a process for the selection of HIV-1 subtype (clade) C isolates, selected HIV-1 subtype C isolates, their genes and modifications and derivatives thereof for use in prophylactic and therapeutic vaccines to produce proteins and polypeptides for the purpose of eliciting protection against HIV infection or disease. The process for the selection of HIV subtype isolates comprises the steps of isolating viruses from recently infected subjects; generating a consensus sequence for at least part of at least one HIV gene by identifying the most common codon or amino acid among the isolated viruses; and selecting the isolated virus or viruses with a high sequence identity to the consensus sequence. HIV-1 subtype C isolates, designated Du422, Du151 and Du179 (assigned Accession Numbers 01032114, 00072724 and 00072725, respectively, by the European Collection of Cell Cultures) are also provided.

BACKGROUND TO THE INVENTION

[0001] This invention relates to a process for the selection of HIV-1 subtype (clade) C isolates, selected HIV-1 subtype C isolates, their genes and modifications and derivatives thereof for use in prophylactic and therapeutic vaccines to produce proteins and polypeptides for the purpose of eliciting protection against HIV infection or disease.

[0002] The disease acquired immunodeficiency syndrome (AIDS) is caused by human immunodeficiency virus (HIV). Over 34 million people worldwide are thought to be living with HIV/AIDS, with over 90% of infected people living in developing countries (UNAIDS, 1999). It is estimated that 24 million infected people reside in sub-Saharan Africa and that South Africa currently has one of the world's fastest growing HIV-1 epidemics. At the end of 1999, over 22% of pregnant women attending government antenatal clinics in South Africa were HIV positive (Department of Health, 2000). A preventative vaccine is considered to be the only feasible way to control this epidemic in the long term.

[0003] HIV shows remarkable genetic diversity that has confounded the development of a vaccine. The molecular basis of variation resides in the viral enzyme reverse transcriptase, which not only introduces an error every round of replication, but also promotes recombination between viral RNAs. Based on phylogenetic analysis of sequences, HIV has been classified into a number of groups: the M (major) group, which comprises subtypes A to H and K, the O (outlier) group and the N (non-M, non-O) group. Recently, recombinant viruses have been identified more frequently, and a number have spread significantly and established epidemics (circulating recombinant forms or CRFs), such as the subtype A/G recombinant in West Africa and the CRF A/E recombinant in Thailand (Robertson et al., 2000).

[0004] Subtype C predominates in the Southern African region, which includes Botswana, Zimbabwe, Zambia, Malawi, Mozambique and South Africa. In addition, increasing numbers of subtype C infections are being detected in the Southern region of Tanzania. This subtype also predominates in Ethiopia and India and is becoming more important in China.

[0005] A possible further obstacle to vaccine development is that the biological properties of HIV change as disease progresses. HIV requires two receptors to infect cells: CD4 and a co-receptor, of which CCR5 and CXCR4 are the major co-receptors used by HIV-1 strains. The most commonly transmitted phenotype is non-syncytium-inducing (NSI), macrophage-tropic viruses that utilise the CCR5 co-receptor for entry (R5 viruses). Langerhans cells in the mucosa are thought to selectively pick up R5 variants at the portal of entry and transport them to the lymph nodes, where they undergo replication and expansion. As the infection progresses, viruses evolve that have increased replicative capacity and the ability to grow in T cell lines. These syncytium-inducing (SI) T-tropic viruses use CXCR4 in conjunction with or in preference to CCR5, and in some cases also use other minor co-receptors (Connor et al., 1997, Richman & Bozzette, 1994). However, HIV-1 subtype C viruses appear to be unusual in that they do not readily undergo this phenotypic switch, as R5 viruses are also predominant in patients with advanced AIDS (Bjorndal et al., 1999, Peeters et al., 1999, Ping et al., 1999, Tscherning et al., 1998, Scarlatti et al., 1997).

SUMMARY OF THE INVENTION

[0006] According to one aspect of the invention a process for the selection of HIV subtype isolates for use in the development of prophylactic and therapeutic pharmaceutical compositions comprises the following steps:

[0007] isolating viruses from recently infected subjects;

[0008] generating a consensus sequence for at least part of at least one HIV gene by identifying the most common codon or amino acid among the isolated viruses at each position along at least part of the gene; and

[0009] selecting the isolated virus or viruses with a high sequence identity to the consensus sequence and a phenotype which is associated with transmission for the particular HIV subtype.

[0010] The isolated virus may be of the same subtype as a likely challenge strain.

[0011] The HIV subtype is preferably HIV-1 subtype C.

[0012] For HIV-1 subtype C, the phenotype which is associated with transmission is typically a virus that utilises the CCR5 co-receptor and is non-syncytium-inducing (NSI).

[0013] According to another aspect of the invention an HIV-1 subtype C isolate, designated Du422 and assigned Provisional Accession Number 01032114 by the European Collection of Cell Cultures, is provided.

[0014] According to another aspect of the invention an HIV-1 subtype C isolate, designated Du151 and assigned Accession Number 00072724 by the European Collection of Cell Cultures, is provided.

[0015] According to another aspect of the invention an HIV-1 subtype C isolate, designated Du179 and assigned Accession Number 00072725 by the European Collection of Cell Cultures, is provided.

[0016] According to another aspect of the invention a molecule is provided, the molecule having:

[0017] (i) the nucleotide sequence set out in Sequence I.D. No. 1;

[0018] (ii) an RNA sequence corresponding to the nucleotide sequence set out in Sequence I.D. No. 1;

[0019] (iii) a sequence which will hybridise to the nucleotide sequence set out in Sequence I.D. No. 1 or an RNA sequence corresponding to it, under strict hybridisation conditions;

[0020] (iv) a sequence which is homologous to the nucleotide sequence set out in Sequence I.D. No. 1 or an RNA sequence corresponding to it; or

[0021] (v) a sequence which is a modification or derivative of the sequence of any one of (i) to (iv).

[0022] The modified sequence is preferably that set out in Sequence I.D. No. 7.

[0023] According to another aspect of the invention a molecule is provided, the molecule having:

[0024] (i) the nucleotide sequence set out in Sequence I.D. No. 3;

[0025] (ii) an RNA sequence corresponding to the nucleotide sequence set out in Sequence I.D. No. 3;

[0026] (iii) a sequence which will hybridise to the nucleotide sequence set out in Sequence I.D. No. 3 or an RNA sequence corresponding to it, under strict hybridisation conditions;

[0027] (iv) a sequence which is homologous to the nucleotide sequence set out in Sequence I.D. No. 3 or an RNA sequence corresponding to it; or

[0028] (v) a sequence which is a modification or derivative of the sequence of any one of (i) to (iv).

[0029] The modified sequence is preferably that set out in Sequence I.D. No. 9.

[0030] According to another aspect of the invention a molecule is provided, the molecule having:

[0031] (i) the nucleotide sequence set out in Sequence I.D. No. 5;

[0032] (ii) an RNA sequence corresponding to the nucleotide sequence set out in Sequence I.D. No. 5;

[0033] (iii) a sequence which will hybridise to the nucleotide sequence set out in Sequence I.D. No. 5 or an RNA sequence corresponding to it, under strict hybridisation conditions;

[0034] (iv) a sequence which is homologous to the nucleotide sequence set out in Sequence I.D. No. 5 or an RNA sequence corresponding to it; or

[0035] (v) a sequence which is a modification or derivative of the sequence of any one of (i) to (iv).

[0036] The modified sequence is preferably that set out in Sequence I.D. No. 11.

[0037] According to another aspect of the invention a molecule is provided, the molecule having:

[0038] (i) the nucleotide sequence set out in Sequence I.D. No. 13;

[0039] (ii) an RNA sequence corresponding to the nucleotide sequence set out in Sequence I.D. No. 13;

[0040] (iii) a sequence which will hybridise to the nucleotide sequence set out in Sequence I.D. No. 13 or an RNA sequence corresponding to it, under strict hybridisation conditions;

[0041] (iv) a sequence which is homologous to the nucleotide sequence set out in Sequence I.D. No. 13 or an RNA sequence corresponding to it; or

[0042] (v) a sequence which is a modification or derivative of the sequence of any one of (i) to (iv).

[0043] The modified sequence preferably has similar or the same modifications as those set out in Sequence I.D. No. 11 for the env gene of the isolate Du151.

[0044] According to another aspect of the invention a polypeptide is provided, the polypeptide having:

[0045] (i) the amino acid sequence set out in Sequence I.D. No. 2; or

[0046] (ii) a sequence which is a modification or derivative of the amino acid sequence set out in Sequence I.D. No. 2.

[0047] The modified sequence is preferably that set out in Sequence I.D. No. 8.

[0048] According to another aspect of the invention a polypeptide is provided, the polypeptide having:

[0049] (i) the amino acid sequence set out in Sequence I.D. No. 4; or

[0050] (ii) a sequence which is a modification or derivative of the amino acid sequence set out in Sequence I.D. No. 4.

[0051] The modified sequence is preferably that set out in Sequence I.D. No. 10.

[0052] According to another aspect of the invention a polypeptide is provided, the polypeptide having:

[0053] (i) the amino acid sequence set out in Sequence I.D. No. 6; or

[0054] (ii) a sequence which is a modification or derivative of the amino acid sequence set out in Sequence I.D. No. 6.

[0055] The modified sequence is preferably that set out in Sequence I.D. No. 12.

[0056] According to another aspect of the invention a polypeptide is provided, the polypeptide having:

[0057] (i) the amino acid sequence set out in Sequence I.D. No. 14; or

[0058] (ii) a sequence which is a modification or derivative of the amino acid sequence set out in Sequence I.D. No. 14.

[0059] The modified sequence preferably has similar or the same modifications as those set out in Sequence I.D. No. 12 for the amino acid sequence of the env gene of the isolate Du151.

[0060] According to another aspect of the invention a consensus amino acid sequence for the partial gag gene of HIV-1 subtype C is the following:
GEKLDKWEKI RLRPGGKKHY MLKHLVWASR ELERFALNPG LLETSEGCKQ⁵⁰
IMKQLQPALQ TGTEELRSLY NTVATLYCVH EKIEVRDTKE ALDKIEEEQN¹⁰⁰
KDQQ-CQQKT QQAKAADGG- KVSQNYPIVQ NLQGQMVHQA ISPRTLNAWV¹⁵⁰
EEKAFSP    EVIPMFTALS EGATPQDLNT MLNTVGGHQA AMQMLKDTIN²⁰⁰
EEAAEWDRLH PVHAGPIAPG QMREPRGSDI AGTTSTLQEQ IAWMTSNPPI²⁵⁰
PVGDIYKRWI ILGLNKIVRM YSPVSILDIK QGPKEPFRDY VDRFFKTLRA³⁰⁰
EQATQDVKNW MTD³¹³

[0061] According to another aspect of the invention a consensus amino acid sequence for the partial pol gene of HIV-1 subtype C is the following:
LTEEKIKALT AICEEMEKEG KITKIGPENP YNTPVFAIKK KDSTKWRKL⁵⁰
VDFRELNKRT QDFWEVQLGI PHPAGLKKKK SVTVLDVGDA YFSVPLDEGF¹⁰⁰
RKYTAFTIPS INNETPGIRY QYNVLPQGWK GSPAIFQSSM TKILEPFRAK¹⁵⁰
NPEIVIYQYM DDLYVGSDLE IGQHRAKIEE LREHLLKWGF TTPDKKHQKE²⁰⁰
PPFLWMGYEL HPDKWTVQPI QLPEKDSWTV NDIQKLVGKL NWASQIYPGI²⁵⁰
KVRQLCKLLR GAKALTDIVP LTEEAELE²⁷⁸

[0062] According to another aspect of the invention a consensus amino acid sequence for the partial env gene of HIV-1 subtype C is the following:
YCAPAGYAIL KCNNKTFNGT GPCNNVSTVQ CTHGIKPVVS TQLLLNGSLA⁵⁰
EEEIIIRSEN LTNNAKTIIV HLNESVEIVC TRPNNNTRKS IRIGPGQTFY¹⁰⁰
ATGDIIGDIR QAHCNISEGK WNKTLQKVKK KLKEELYKYK VVEIKPLGIA¹⁵⁰
PTEAKRRVVE REKRAVGIGA VFLGFLGAAG STMGAASITL TVQARQLLSG²⁰⁰
IVQQQSNLLR AIEAQQHMLQ LTVWGIKQL²²⁹

DESCRIPTION OF THE DRAWINGS

[0063] FIG. 1 shows a schematic representation of the HIV-1 genome and illustrates the location of overlapping fragments that were sequenced, having been generated by reverse transcription followed by polymerase chain reaction, in order to generate the South African consensus sequence;

[0064] FIG. 2 shows a phylogenetic tree of nucleic acid sequences of various HIV-1 subtype C isolates based on the (partial) sequences of the gag gene of the various isolates and includes a number of consensus sequences as well as the South African consensus sequence of the present invention and a selected isolate, Du422, of the present invention;

[0065] FIG. 3 shows a phylogenetic tree of nucleic acid sequences of various HIV-1 subtype C isolates based on the (partial) sequences of the pol gene of the various isolates and includes a number of consensus sequences as well as the South African consensus sequence of the present invention and a selected isolate, Du151, of the present invention;

[0066] FIG. 4 shows a phylogenetic tree of nucleic acid sequences of various HIV-1 subtype C isolates based on the (partial) sequences of the env gene of the various isolates and includes a number of consensus sequences as well as the South African consensus sequence of the present invention and a selected isolate, Du151, of the present invention;

[0067] FIG. 5 shows how the sequences of the gag genes of each of a number of isolates vary from the South African consensus sequence for the gag gene which was developed according to the present invention;

[0068] FIG. 6 shows how the sequences of the pol genes of each of a number of isolates vary from the South African consensus sequence for the pol gene which was developed according to the present invention;

[0069] FIG. 7 shows how the sequences of the env genes of each of a number of isolates vary from the South African consensus sequence for the env gene which was developed according to the present invention;

[0070] FIG. 8 shows a phylogenetic tree of amino acid sequences of various HIV-1 subtype C isolates based on the sequences of the (partial) gag gene of the various isolates and includes a number of consensus sequences as well as the South African consensus sequence of the present invention and a selected isolate, Du422, of the present invention;

[0071] FIG. 9 shows a phylogenetic tree of amino acid sequences of various HIV-1 subtype C isolates based on the sequences of the (partial) pol gene of the various isolates and includes a Cpol consensus sequence as well as a South African consensus sequence of the present invention and a selected isolate, Du151, of the present invention;

[0072] FIG. 10 shows a phylogenetic tree of amino acid sequences of various HIV-1 subtype C isolates based on the sequences of the (partial) env gene of the various isolates and includes a Cenv consensus sequence as well as a South African consensus sequence of the present invention and a selected isolate, Du151, of the present invention;

[0073] FIG. 11 shows the percentage amino acid sequence identity of the sequenced gag genes of the various isolates in relation to one another, to the gag clone and to the South African consensus sequence for the gag gene and is based on a pairwise comparison of the gag genes of the isolates;

[0074] FIG. 12 shows the percentage amino acid sequence identity of the sequenced pol genes of the various isolates in relation to one another, to the pol clone and to the South African consensus sequence for the pol gene and is based on a pairwise comparison of the pol genes of the isolates;

[0075] FIG. 13 shows the percentage amino acid sequence identity of the sequenced env genes of the various isolates in relation to one another, to the env clone and to the South African consensus sequence for the env gene and is based on a pairwise comparison of the env genes of the isolates;

[0076] FIG. 14 shows a phylogenetic tree analysis of nucleic acid sequences of various HIV-1 subtype C isolates (or vaccine strains) based on the complete sequences of the gag genes of the various isolates and shows the gag gene from a selected isolate, Du422, of the present invention compared to the other subtype C sequences;

[0077] FIG. 15 shows a phylogenetic tree analysis of nucleic acid sequences of various HIV-1 subtype C isolates (or vaccine strains) based on the complete sequences of the pol genes of the various isolates and shows the pol gene from a selected isolate, Du151, of the present invention compared to the other subtype C sequences;

[0078] FIG. 16 shows a phylogenetic tree analysis of nucleic acid sequences of various HIV-1 subtype C isolates (or vaccine strains) based on the complete sequences of the env gene of the various isolates and shows the env gene from a selected isolate, Du151, of the present invention compared to the other subtype C sequences; and

LIST OF SEQUENCES

[0079] Sequence I.D. No 1 shows the nucleic acid sequence (cDNA) of the sequenced gag gene of the isolate Du422;

[0080] Sequence I.D. No 2 shows the amino acid sequence of the sequenced gag gene of the isolate Du422, derived from the nucleic acid sequence;

[0081] Sequence I.D. No 3 shows the nucleic acid sequence (cDNA) of the sequenced pol gene of the isolate Du151;

[0082] Sequence I.D. No 4 shows the amino acid sequence of the sequenced pol gene of the isolate Du151, derived from the nucleic acid sequence;

[0083] Sequence I.D. No 5 shows the nucleic acid sequence (cDNA) of the sequenced env gene of the isolate Du151;

[0084] Sequence I.D. No 6 shows the amino acid sequence of the sequenced env gene of the isolate Du151, derived from the nucleic acid sequence;

[0085] Sequence I.D. No 7 shows the nucleic acid sequence (DNA) of the resynthesized sequenced gag gene of the isolate Du422 modified to reflect human codon usage for the purposes of increased expression;

[0086] Sequence I.D. No 8 shows the amino acid sequence of the resynthesized sequenced gag gene of the isolate Du422 modified to reflect human codon usage for the purposes of increased expression;

Sequence I.D. No 9 shows the nucleic acid sequence (DNA) of the resynthesized sequenced pol gene of the isolate Du151 modified to reflect human codon usage for the purposes of increased expression;

[0087] Sequence I.D. No 10 shows the amino acid sequence of the resynthesized sequenced pol gene of the isolate Du151 modified to reflect human codon usage for the purposes of increased expression;

[0088] Sequence I.D. No 11 shows the nucleic acid sequence (DNA) of the resynthesized sequenced env gene of the isolate Du151 modified to reflect human codon usage for the purposes of increased expression;

[0089] Sequence I.D. No 12 shows the amino acid sequence of the resynthesized sequenced env gene of the isolate Du151 modified to reflect human codon usage for the purposes of increased expression;

[0090] Sequence I.D. No 13 shows the nucleic acid sequence (cDNA) of the sequenced env gene of the isolate Du179; and

[0091] Sequence I.D. No 14 shows the amino acid sequence of the sequenced env gene of the isolate Du179.

DETAILED DESCRIPTION OF THE INVENTION

[0092] This invention relates to the selection of HIV-1 subtype isolates and the use of their genes and modifications and derivatives thereof in making prophylactic and therapeutic pharmaceutical compositions and formulations, and in particular vaccines against HIV-1 subtype C. The compositions could therefore be used either prophylactically to prevent infection or therapeutically to prevent or modify disease. A number of factors must be taken into consideration in the development of an HIV vaccine and one aspect of the present invention relates to a process for the selection of suitable HIV isolates for the development of a vaccine.

[0093] The applicant envisages that the vaccine developed according to the above method could be used against one or more HIV subtypes other than HIV-1 subtype C.

[0094] An HIV vaccine aims to elicit both a CD8+ cytotoxic T lymphocyte (CTL) immune response and a neutralizing antibody response. Many current vaccine approaches have primarily focused on inducing a CTL response. It is thought that the CTL response may be more important as it is associated with the initial control of viral replication after infection, as well as control of replication during disease, and is inversely correlated with disease progression (Koup et al., 1994, Ogg et al., 1999, Schmitz et al., 1999). The importance of CTL in protecting individuals from infection is demonstrated by their presence in highly exposed seronegative individuals such as sex workers (Rowland-Jones et al., 1998).

[0095] Knowledge of genetic diversity is highly relevant to the design of vaccines aiming at eliciting a cytotoxic T-lymphocyte (CTL) response. There are many CTL epitopes in common between viruses, particularly in the gag and pol region of the genome (HIV Molecular Immunology Database, 1998). In addition, several studies have now shown that there is a cross-reactive CTL response: individuals vaccinated with a subtype B-based vaccine could lyse autologous targets infected with a diverse group of isolates (Ferrari et al., 1997); and CTLs from non-B infected individuals could lyse subtype B-primed targets (Betts et al., 1997; Durali et al., 1998). A comparison of CTL epitopes in the HIV-1 sequence database shows about 40% of gp41 and 84% of p24 epitopes are identical or have only one amino acid difference between subtypes. Although this is a very crude analysis and does not take into consideration populations or dominant responses to certain epitopes, it does indicate that there is a greater conservation of cytotoxic T epitopes within a subtype than between subtypes and that there will be a greater chance of a CTL response if the challenge virus is the same subtype as the vaccine strain.

[0096] The importance of genetic diversity in inducing a neutralizing antibody response appears to be less crucial. In general, neutralization serotypes are not related to genetic subtype. Some individuals elicit antibodies that can neutralize a broad range of viruses, including viruses of different subtypes, while others fail to elicit effective neutralizing antibodies at all (Wyatt and Sodroski, 1998; Kostrikis et al., 1996; Moore et al., 1996). As neutralizing antibodies are largely evoked against functional domains of the virus which are essentially conserved, it is probable that HIV-1 genetic diversity may not be relevant in producing a vaccine designed to elicit neutralizing antibodies. Viral strains used in the design of a vaccine need to be shown by genotypic analysis to be representative of the circulating strains and not an unusual or outlier strain. In addition, it is important that a vaccine strain also has the phenotype of a recently transmitted virus, which is NSI and uses the CCR5 co-receptor.

[0097] A process was developed to identify appropriate strains for use in developing a vaccine for HIV-1 subtype C. Viral isolates from acutely infected individuals were collected. They were sequenced in the env, gag and pol regions and the amino acid sequences for the env, gag and pol genes from these isolates were compared. A consensus sequence, the South African consensus sequence, was then formed by selecting the most frequently appearing amino acid at each position. The consensus sequence for each of the gag, pol and env genes of HIV-1 subtype C also forms an aspect of the invention. Appropriate strains for vaccine development were then selected from these isolates by comparing them with the consensus sequence and characterising them phenotypically. The isolates also form an aspect of the invention.
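The consensus step described in the preceding paragraph can be illustrated with a minimal Python sketch. It assumes the isolate sequences have already been aligned to equal length; the function name, the tie-breaking rule (first residue in input order to reach the highest count) and the toy sequences are hypothetical and are not taken from the specification.

```python
from collections import Counter


def consensus(aligned):
    """Return the most frequent residue at each column of aligned sequences.

    Assumes every sequence has the same length (i.e. is already aligned).
    Ties are broken by input order, which is an assumption of this sketch.
    """
    if not aligned:
        raise ValueError("no sequences supplied")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")

    result = []
    for column in zip(*aligned):
        counts = Counter(column)
        top = max(counts.values())
        # First residue, in input order, that reaches the highest count.
        result.append(next(res for res in column if counts[res] == top))
    return "".join(result)


# Toy, invented variants of a ten-residue stretch (illustration only).
toy_alignment = ["GEKLDKWEKI", "GEKLDRWEKI", "GERLDKWEKI", "GEKLDKWERI"]
print(consensus(toy_alignment))
```

The same column-wise count could equally be taken over codons rather than amino acids, as the process permits either.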

[0098] In order to select for NSI strains which use the CCR5 co-receptor, a well-established sex worker cohort was used to identify the appropriate strains. Appropriate strains were identified from acutely infected individuals by comparing them with the consensus sequence which had been formed. Viral isolates from fifteen acutely infected individuals were sequenced in the env, gag and pol regions and phenotypically characterised. These sequences were compared with viral isolates from fifteen asymptomatic individuals from another region having more than 500 CD4 cells and with other published subtype C sequences located in the Los Alamos Database (http://www.hiv-web.lanl.gov/).

[0099] Three potential vaccine strains, designated Du151, Du422 and Du179, were selected. Du151 and Du422 were selected based on amino acid homology to the consensus sequence in all three gene regions (env, gag and pol), CCR5 tropism and ability to grow and replicate in tissue culture. Du179 is an R5X4 virus and was selected because the patient in which this strain was found showed a high level of neutralising antibodies. The nucleotide and amino acid sequences of the three gene regions of the three isolates and modifications and derivatives thereof also form aspects of the invention.

[0100] The vaccines of the invention will be formulated in a number of different ways using a variety of different vectors. They involve encapsulating RNA or transcribed DNA sequences from the viruses in a variety of different vectors. The vaccines will contain at least part of the gag gene from the Du422 isolate, and at least part of the pol and env genes from the Du151 isolate of the present invention and/or at least part of the env gene from the Du179 isolate of the present invention or derivatives or modifications thereof.

[0101] Genes for use in DNA vaccines have been resynthesized to reflect human codon usage. The gag Du422 gene was designed so that the myristylation site and inhibitory sequences were removed. Similarly resynthesized gp160 (the complete env gene consisting of gp120 and gp41) and pol genes will be expressed by DNA vaccines. The gp160 gene sequence has also been changed as described above for the gag gene to reflect human codon usage and the rev responsive element removed. The protease, inactivated reverse transcriptase and start of the RNase H genes from Du151 pol are optimised for increased expression and will be joined with gag at an inserted Bgl1 site. The gag-pol frameshift will be maintained to keep the natural balance of gag to pol protein expression.
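As a rough illustration of what resynthesis to reflect human codon usage means, the sketch below back-translates a short peptide using one commonly preferred human codon per amino acid. The codon table is deliberately partial and is an assumption of this sketch; it does not reproduce the resynthesized sequences of Sequence I.D. Nos. 7, 9 or 11, and it does not model the removal of the myristylation site, inhibitory sequences or rev responsive element.

```python
# Partial, illustrative table: one frequently used human codon per residue.
# An assumption of this sketch, not the table used for the actual genes.
PREFERRED_CODON = {
    "G": "GGC", "E": "GAG", "K": "AAG", "L": "CTG",
    "D": "GAC", "W": "TGG", "I": "ATC", "M": "ATG",
}


def back_translate(protein):
    """Encode a peptide with the preferred codon listed for each residue."""
    try:
        return "".join(PREFERRED_CODON[aa] for aa in protein)
    except KeyError as exc:
        raise ValueError(f"no codon listed for residue {exc.args[0]!r}") from None


def translate(dna):
    """Decode DNA produced by back_translate (handles only the listed codons)."""
    reverse = {codon: aa for aa, codon in PREFERRED_CODON.items()}
    return "".join(reverse[dna[i:i + 3]] for i in range(0, len(dna), 3))


peptide = "GEKLDKWEKI"  # first ten residues of the partial gag consensus
dna = back_translate(peptide)
print(dna)
print(translate(dna) == peptide)
```

The amino acid sequence is unchanged by this kind of recoding; only the choice among synonymous codons differs, which is why the round trip returns the original peptide.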

[0102] Another vaccine will contain DNA transcribed from the RNA for the gag gene from the Du422 isolate and RNA from the pol and env genes from the Du151 isolate and/or RNA from the env gene from the Du179 isolate. These genes could also be expressed as oligomeric envelope glycoprotein complexes (Progenics, USA), as published in J Virol 2000 Jan;74(2):627-43 (Binley, J. L. et al.), or in the adeno-associated virus (AAV) (Target Genetics) and the Venezuelan equine encephalitis virus (U.S. patent application Ser. No. 60/216,995, which is incorporated herein by reference).

[0103] The Isolation and Selection of Viral Strains for the Design of a Vaccine

[0104] The following criteria were used to select appropriate strains for inclusion into HIV-1 vaccines for Southern Africa:

[0105] that the strains be genotypically representative of circulating strains;

[0106] that the strain not be an outlier strain;

[0107] that the strain be as close as possible to the consensus amino acid sequence developed according to the invention for the env, gag and pol genes of HIV-1 subtype C;

[0108] that the strain have an R5 phenotype, i.e. a phenotype associated with transmission for selection of the RNA or cDNA to be included for the env region; and

[0109] that the strain be able to be grown in tissue culture.
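The criteria above can be sketched as a simple filter-and-rank step. This is only an illustration under stated assumptions: the `Isolate` record, the function names and the toy isolates are hypothetical, sequences are assumed to be pre-aligned to the consensus, and identity is computed as a plain per-position match percentage.

```python
from dataclasses import dataclass


@dataclass
class Isolate:
    name: str
    sequence: str           # amino acid sequence aligned to the consensus
    biotype: str            # co-receptor usage, e.g. "R5" or "R5X4"
    grows_in_culture: bool  # whether the virus could be grown in tissue culture


def percent_identity(a, b):
    """Percentage of aligned positions at which two sequences agree."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def rank_candidates(isolates, consensus_seq):
    """Keep culturable R5 isolates and order them by identity to the consensus."""
    eligible = [i for i in isolates if i.biotype == "R5" and i.grows_in_culture]
    return sorted(
        eligible,
        key=lambda i: percent_identity(i.sequence, consensus_seq),
        reverse=True,
    )


# Invented toy data; names and sequences are placeholders, not real isolates.
consensus_seq = "GEKLDKWEKI"
toy_isolates = [
    Isolate("iso_B", "GERLDRWEKI", "R5", True),
    Isolate("iso_A", "GEKLDKWEKI", "R5", True),
    Isolate("iso_C", "GEKLDKWEKI", "R5X4", True),
    Isolate("iso_D", "GEKLDKWEKV", "R5", False),
]
for iso in rank_candidates(toy_isolates, consensus_seq):
    print(iso.name, percent_identity(iso.sequence, consensus_seq))
```

A strict R5 filter of this kind would exclude an R5X4 isolate such as Du179, which the specification selects on a separate ground (the neutralising antibody level of the source patient), so the sketch covers only the consensus-based route.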

[0110] The following procedure was followed in the selection of viral strains for the design of a vaccine. A well-established sex worker cohort in KwaZulu-Natal, South Africa was used to identify the appropriate strains for use in an HIV vaccine. Viral isolates from 15 acutely infected individuals were sequenced in env, gag and pol and were also isolated and phenotypically characterised. These sequences were compared with a similar collection from asymptomatic individuals from the Gauteng region in South Africa as well as other published subtype C sequences.

[0111] Patients

[0112] Individuals with HIV infection were recruited from 4 regions in South Africa. Blood samples were obtained from recently infected sex workers from KwaZulu-Natal (n=13). Recent infection was defined as individuals who were previously seronegative and had become seropositive within the previous year. Samples were also collected from individuals attending out-patient clinics in Cape Town (n=2), women attending ante-natal clinics in Johannesburg (n=7) and men attending an STD clinic on a gold mine outside Johannesburg (n=8). The latter 2 groups were clinically stable and were classified as asymptomatic infections. Blood samples were collected in EDTA and used to determine the CD4 T cell count and for genetic analysis of the virus. In the case of recent infections a branched-chain DNA (bDNA) assay (Chiron) to measure plasma viral load was done, and the virus was isolated. HIV-1 serostatus was determined by ELISA. CD4 T cell counts and viral loads were determined for the sex workers; these results, together with information on clinical status as at the date of seroconversion and data on co-receptor usage, are set out in Table 1 below.

[0113] Virus Isolation

[0114] HIV was isolated from peripheral blood mononuclear cells (PBMC) using standard co-culture techniques with mitogen-activated donor PBMC. 2×10⁶ patient PBMC were co-cultured with 2×10⁶ donor PBMC in 12-well plates with 2 ml RPMI 1640 with 20% FCS, antibiotics and 5% IL-2 (Boehringer). Cultures were replenished twice weekly with fresh medium containing IL-2 and once with 5×10⁵/ml donor PBMC. Virus growth was monitored weekly using a commercial p24 antigen assay (Coulter). Antigen-positive cultures were expanded and cultured for a further 2 weeks to obtain 40 ml of virus-containing supernatant, which was stored at −70 °C until use. The results of the isolation of the viruses from the commercial sex workers are also shown in Table 1 below.

[0115] Viral Phenotypes

[0116] Virus-containing supernatant was used to assess the biological phenotype of viral isolates on MT-2 and co-receptor transfected cell lines. For the MT-2 assay, 500 µl of supernatant was incubated with 5×10⁴ MT-2 cells in RPMI plus 10% FCS and antibiotics. Cultures were monitored daily for syncytia formation over 6 days. U87.CD4 cells expressing either the CCR5 or CXCR4 co-receptor were grown in DMEM with 10% FCS, antibiotics, 500 µg/ml G418 and 1 µg/ml puromycin. GHOST cells expressing minor co-receptors were grown in DMEM with 10% FCS, 500 µg/ml G418, 1 µg/ml puromycin and 100 µg/ml hygromycin. Cell lines were passaged twice weekly by trypsinisation. Co-receptor assays were done in 12-well plates; 5×10⁴ cells were plated in each well and allowed to adhere overnight. The following day 500 µl of virus-containing supernatant was added and incubated overnight to allow viral attachment and infection, and the cells were washed three times the following day. Cultures were monitored on days 4, 8 and 12 for syncytia formation and p24 antigen production. Cultures that showed evidence of syncytia and increasing concentrations of p24 antigen were considered positive for viral growth. The results of co-receptor usage of the viruses from the commercial sex workers are also shown in Table 1.

TABLE 1
COHORT OF ACUTE INFECTIONS FOR SELECTION OF VACCINE CANDIDATES

Sample ID | Sero date | Sample date | Duration of infection | CD4 count | Viral load | Co-culture p24 pos | MT-2 assay | Biotype
Du115 | 15 May 1998 | 20 May 1999 | 1 year | 437* | 7,597* | — | No isolate | —
Du123 | 17 Aug. 1998 | 17 Nov. 1998 | 3 mon | 841 | 19,331 | d6 (50 pg) | NSI | R5
Du151 | 12 Oct. 1998 | 24 Nov. 1998 | 1.5 mon | 367 | >500,000 | d6 (>1 ng) | NSI | R5
Du156 | 16 Nov. 1998 | 17 Nov. 1998 | <1 mon | 404 | 22,122 | d6 (>1 ng) | NSI | R5
Du172 | 16 Oct. 1998 | 17 Nov. 1998 | 1 mon | 793 | 1,916 | d6 (<50 pg) | NSI | R5
Du174 | 6 Oct. 1997 | 25 May 1999 | 19.5 mon | 634* | 9,454* | d14 (>1 ng) | NSI | R5
Du179 | 13 Aug. 1997 | 20 May 1999 | 21 mon | 394* | 1,359* | d7 (<50 pg) | SI | R5X4
Du204 | 20 May 1998 | 20 May 1999 | 1 year | 633* | 8,734* | d7 (<50 pg) | NSI | R5
Du258 | 3 Jun. 1998 | 22 Jun. 1999 | 1 year | 433* | 9,114* | — | No isolate | —
Du281 | 24 Jul. 1998 | 17 Nov. 1998 | 4 mon | 594 | 24,689 | d6 (1 ng) | NSI | R5
Du285 | 2 Oct. 1998 | — | — | 560* | 161* | — | No isolate | —
Du368 | 8 Apr. 1998 | 24 Nov. 1998 | 7.5 mon | 670 | 13,993 | d6 (300 pg) | NSI | R5
Du422 | 2 Oct. 1998 | 28 Jan. 1999 | 4 mon | 397 | 17,118* | d6 (600 pg) | NSI | R5
Du457 | 17 Aug. 1998 | 17 Nov. 1998 | 3 mon | 665 | 6,658 | — | No isolate | —
Du467 | 26 Aug. 1998 | — | — | 671 | 19,268 | — | No isolate | —
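The positivity criterion described for the co-receptor assays (syncytia together with rising p24 antigen) can be written as a simple rule. The following Python sketch is illustrative only: the function names, the data layout, the reading of "increasing" as strictly increasing between consecutive readings, and the mapping of co-receptor results to the biotype labels R5, X4 and R5X4 are assumptions made for the example and are not part of the described protocol.

```python
from typing import Dict, Sequence


def culture_positive(syncytia_seen: bool, p24_by_day: Sequence[float]) -> bool:
    """Return True when syncytia were seen and p24 rose at every reading.

    p24_by_day holds p24 antigen readings in time order (for example
    days 4, 8 and 12). Requiring a strict rise between consecutive
    readings is an assumption of this sketch.
    """
    rising = len(p24_by_day) >= 2 and all(
        later > earlier for earlier, later in zip(p24_by_day, p24_by_day[1:])
    )
    return syncytia_seen and rising


def biotype(positive_on: Dict[str, bool]) -> str:
    """Map growth on CCR5- and CXCR4-expressing cells to a biotype label."""
    r5 = positive_on.get("CCR5", False)
    x4 = positive_on.get("CXCR4", False)
    if r5 and x4:
        return "R5X4"
    if r5:
        return "R5"
    if x4:
        return "X4"
    return "no growth"
```

For example, a culture with syncytia and hypothetical p24 readings of 10, 50 and 300 pg on days 4, 8 and 12 would be scored positive, and growth on both CCR5 and CXCR4 cells would be labelled R5X4.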

[0117] Sequencing

[0118] RNA was isolated from plasma and the gene fragments were amplified from RNA using reverse transcriptase to generate a cDNA, followed by PCR to generate amplified DNA segments. The positions of the PCR primers are as follows, with the second of each primer pair being used as the reverse transcriptase primer in the cDNA synthesis step (numbering using the HIV-1 HXB2 sequence): gag1 (790-813, 1282-1303), gag2 (1232-1253, 1797-1820), pol1 (2546-2573, 3012-3041), pol2 (2932-2957, 3492-3515), env1 (6815-6838, 7322-7349), env2 (7626-7653, 7963-7986). The amplified DNA fragments were purified using the QIAQUICK PCR Purification Kit (Qiagen, Germany). The DNA fragments were then sequenced using the upstream PCR primers as sequencing primers. Sequencing was done using the Sanger dideoxy terminator strategy with fluorescent dyes attached to the dideoxynucleotides. The sequence determination was made by electrophoresis using an ABI 377 Sequencer. A mapped illustration of an HIV-1 proviral genome showing the gag, pol and env regions sequenced as described above is shown in FIG. 1. The following regions were sequenced (numbering according to HXB2, Los Alamos database): 813-1282 (gag1); 1253-1797 (gag2); 2583-3012 (pol1); 2957-3515 (pol2); 6938-7322 (env1); 7653-7963 (env2), as illustrated in FIG. 1.

[0119] Genotypic Characterisation

[0120] To select the vaccine isolate or isolates, a survey covering portions of the three major HIV genes gag (313 contiguous codons, 939 bases), pol (278 contiguous codons, 834 bases) and env (229 codons in two noncontiguous segments, 687 bases) was done (FIG. 1). The map of FIG. 1 shows the 5′ long terminal repeat, the structural and functional genes (gag, pol and env) as well as the regulatory and accessory proteins (vif, tat, rev, nef, vpr and vpu). The gag open reading frame illustrates the regions encoding the p17 matrix protein, the p24 core protein and the p7 and p6 nucleocapsid proteins. The pol open reading frame illustrates the protease (PR) p15, reverse transcriptase (RT) p66 and the RNase H integrase p51. The env open reading frame indicates the region coding for gp120 and the region coding for gp41.

[0121] Of a total of 31 isolates, 14 were from the Durban cohort (DU), 15 were from Johannesburg (GG and RB) and 2 were from Cape Town (CT). Of these, 30 were sequenced in the gag region, 26 in the pol region and 27 in the env region. The isolates that were sequenced are shown in Table 2.

TABLE 2
LIST OF ISOLATES AND THE GENE REGIONS SEQUENCED

Isolate | Gag sequence | Pol sequence | Env sequence
CTSC1 | ✓ | ✓ | —
CTSC2 | ✓ | ✓ | —
DU115 | ✓ | ✓ | ✓
DU123 | ✓ | — | ✓
DU151 | — | ✓ | ✓
DU156 | ✓ | ✓ | ✓
DU172 | ✓ | ✓ | ✓
DU174 | ✓ | ✓ | ✓
DU179 | ✓ | ✓ | ✓
DU204 | ✓ | ✓ | ✓
DU258 | ✓ | ✓ | ✓
DU281 | ✓ | — | ✓
DU368 | ✓ | ✓ | ✓
DU422 | ✓ | ✓ | ✓
DU457 | ✓ | ✓ | ✓
DU467 | ✓ | — | ✓
GG1 | ✓ | — | —
GG10 | ✓ | ✓ | ✓
GG3 | ✓ | ✓ | ✓
GG4 | ✓ | ✓ | ✓
GG5 | ✓ | ✓ | ✓
GG6 | ✓ | ✓ | ✓
RB12 | ✓ | — | ✓
RB13 | ✓ | ✓ | ✓
RB14 | ✓ | ✓ | ✓
RB15 | ✓ | ✓ | —
RB18 | ✓ | ✓ | ✓
RB21 | ✓ | ✓ | ✓
RB22 | ✓ | ✓ | ✓
RB27 | ✓ | ✓ | ✓
RB28 | ✓ | ✓ | ✓

[0122] The nucleic acid sequences from the Durban (DU), Johannesburg (GG, RB) and Cape Town (CT) cohorts were phylogenetically compared to all available published subtype C sequences (obtained from the Los Alamos HIV Sequence Database), including sequences from the other southern African countries and the overall subtype C consensus from the Los Alamos HIV Sequence Database. This comparison was done to ensure that the selected vaccine isolates were not phylogenetic outliers when compared to the Southern African sequences, and the results of the comparison are shown in FIG. 2, FIG. 3 and FIG. 4. FIGS. 2 to 4 illustrate that the sequences from Southern Africa are divergent and that the Indian sequences form a separate, distinct cluster from these African sequences. The South African sequences are not unique and, in general, are as related to each other as they are to other sequences from Southern Africa. Overall, this suggests that Indian sequences are distinct from Southern African subtype C sequences and that we do not have a clonal epidemic in South Africa; rather, South African viruses reflect the diversity of subtype C viruses in the Southern African region.

[0123] Determination of a Consensus Sequence

[0124] Amino acid sequences were derived from the sequences shown in Table 2 and were used to determine a South African consensus sequence. The most frequently appearing amino acid at each position was selected as the consensus amino acid at that position. In this way, the consensus sequence was determined along the linear length of each of the sequenced gene fragments (gag, pol and env gene fragments). The alignments were done using the Genetics Computer Group (GCG) programs Pileup and Pretty, which generate a consensus sequence in this manner. This resulted in a consensus sequence for each gene region. The alignments of the amino acid sequences and the resulting consensus sequences are shown in FIGS. 5, 6 and 7.
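The majority rule described above (the most frequent amino acid at each aligned position) can be sketched in a few lines of Python. This is a minimal illustration and not a reimplementation of the GCG programs: it assumes the sequences are already aligned to equal length, that gaps are written as "-", and that ties are broken alphabetically, none of which is specified in the text. The toy sequences in the usage note are hypothetical.

```python
from collections import Counter
from typing import List


def consensus(aligned: List[str], gap: str = "-") -> str:
    """Majority-rule consensus of equal-length, pre-aligned sequences.

    Gap characters are ignored when counting. A column that contains
    only gaps yields a gap. Ties are broken alphabetically, which is
    an assumption of this sketch.
    """
    if not aligned:
        raise ValueError("no sequences given")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    out = []
    for column in zip(*aligned):
        counts = Counter(res for res in column if res != gap)
        if not counts:
            out.append(gap)
            continue
        best = max(counts.values())
        # Pick the alphabetically first residue among those with the top count.
        out.append(min(res for res, n in counts.items() if n == best))
    return "".join(out)
```

With three toy aligned fragments "MGARAS", "MGAKAS" and "MGARSS", the function returns "MGARAS", because R outnumbers K at the fourth position and A outnumbers S at the fifth.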

[0125] The phylogenetic tree of amino acids showing a comparison of the South African sequences is set out in FIGS. 8, 9 and 10. The ES2 gag S, which is the sequence of the cloned Du422 gag gene, Du151 pol (clone number) 8, which is the sequence of the cloned Du151 pol gene, and Du151 env (clone number) 25, which is the sequence of the cloned Du151 env gene, are vaccine clones. It can be seen from FIGS. 8, 9 and 10 that they are the same as the original isolates. These phylogenetic trees compare the relationship between the HIV proteins. South African isolates were compared with subtype A, B, C and D consensus sequences as well as with the South African consensus (Sagagcon) derived from the South African sequences, a Malawian consensus (Malgagcon) derived from Malawian sequences and overall consensuses (Cgagcon, Cpolcon and Cenvcon) derived from all subtype C sequences on the Los Alamos database.

[0126] The final choice of which isolate or isolates to use was based on the similarity of the sequence of the gag, env and pol genes of a particular isolate to the South African consensus sequence, which had been derived as set out above, as well as the availability of an R5 isolate which had good replication kinetics, as shown in Table 1.

[0127] Selection of Vaccine Isolates

[0128] Based on the considerations and methodology set out above, three strains were selected from the acute infection cohort as the vaccine strains. The first strain is Du422 for the gag gene, the second strain is Du151 for the pol and env genes and the third strain is Du179, which is a possible alternative for the env gene. These three strains were selected for the following reasons.

[0129] 1. At the time the samples were obtained, Du151 had been infected for 6 weeks and had a CD4 count of 367 cells per µl of blood and a viral load above 500,000 copies per ml of plasma. Given the high viral load and the recorded time from infection, it is probable that the individual was still in the initial stages of viraemia prior to control of HIV replication by the immune system.

[0130] 2. At the time the samples were obtained, Du422 had been infected for 4 months, with a CD4 count of 397 cells per µl of blood and a viral load of 17,118 copies per ml of plasma. In contrast to Du151, this individual had already brought viral replication under control to a certain extent.

[0131] 3. At the time the samples were obtained, Du179 had been infected for 21 months, with a CD4 count of 394 cells per µl of blood and a viral load of 1,359 copies per ml of plasma.

[0132] Based on the analysis of the phylogenetic tree shown in FIG. 8 showing the relationship between full length gp120 sequence and other isolates, and the amino acid pairwise comparison shown in FIG. 11, the Du422 gag sequence was shown to be most similar to the South African consensus sequence shown in FIGS. 2 and 5. It shared 98% amino acid sequence identity with the consensus sequence. In addition, the average pairwise score, expressed as the percentage identity between the DNA sequences, between the DU422 gag sequence and the other sequences from the seroconverters was the highest of any sequence derived from this cohort, at 93.5%, and nearly as high as the average identity of the isolates to the SA consensus sequence (94.2%). The Du422 gag gene was cloned and the specific clone gave values very similar to the original isolate, having a pairwise identity value with the SA consensus of 98% and nearly as high an average identity value with the other isolates as the DU422 isolate (93.3%). Thus, both the original DU422 isolate sequence and the generated clone had the highest pairwise percentage similarity to other isolates, with the minimal values all being above 90%.

[0133] The pol sequences showed the highest values for the pairwise comparisons. Based on the analysis of the phylogenetic tree shown in FIG. 9 and the pairwise identity score with the SA consensus (98.9%) shown in FIG. 12, we chose the DU151 isolate as the source of the pol gene. Other contributing factors in this decision were that this is the same isolate that was chosen for the source of the env gene and that this was an isolate with excellent growth properties in vitro. The actual pol gene clone from the DU151 isolate was somewhat more divergent from the SA consensus sequence (97.8%), and had a smaller average identity score when compared to the other isolates (95.1%). However, we judged the small increase in distance from the consensus not to be significant in this otherwise well conserved HIV-1 gene and therefore chose the DU151 pol gene for further development. Only one of the recent seroconverter sequences was less than 93% identical with the DU151 pol gene segment.

[0134] The env gene showed the greatest sequence diversity. Based on the analysis of the phylogenetic tree shown in FIG. 10, we chose the DU151 env gene. The DU151 env gene segment shows an average pairwise comparison score with the other isolates of 87.2%, with the clone being slightly higher (87.9%). The DU151 isolate gene segment has a pairwise identity score of 92.6% with the SA consensus, while the DU151 clone is at 91.3%. Finally, all pairwise identity scores are above 83% with either the DU151 isolate sequence or the clone when compared to the other recent seroconverters, as shown in FIG. 13. These pairwise scores make the DU151 sequence similar to the best scores in this sequence pool and combine these levels of similarity with an R5 virus with good cell culture replication kinetics.
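The two sequence criteria used in the paragraphs above, identity to the consensus and average pairwise identity to the other isolates, can be sketched as follows. This is a hedged illustration rather than the method used to produce FIGS. 11 to 13: it assumes pre-aligned sequences of equal length, skips columns where either sequence has a gap, and ranks first by identity to the consensus. The function names and the toy sequences in the usage note are hypothetical, and the sketch deliberately ignores the non-sequence criteria (R5 phenotype and replication kinetics) that also informed the selection.

```python
from typing import Dict, List, Tuple


def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percentage of matching positions over columns with no gap in either sequence."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def rank_isolates(
    seqs: Dict[str, str], consensus_seq: str
) -> List[Tuple[str, float, float]]:
    """Return (name, identity to consensus, mean identity to the other isolates).

    Rows are sorted best first: by identity to the consensus, then by
    mean identity to the other isolates.
    """
    rows = []
    for name, seq in seqs.items():
        to_consensus = percent_identity(seq, consensus_seq)
        others = [percent_identity(seq, s) for n, s in seqs.items() if n != name]
        mean_others = sum(others) / len(others) if others else 0.0
        rows.append((name, to_consensus, mean_others))
    return sorted(rows, key=lambda row: (row[1], row[2]), reverse=True)
```

Calling `rank_isolates({"iso1": "MGARASIL", "iso2": "MGAKASIL", "iso3": "MGARSSVL"}, "MGARASIL")` on these toy fragments places iso1 first, since it matches the consensus at every position and has the highest mean identity to the other two.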

[0135] The clones representing the full length gene for each of the above viral genes were generated by PCR. Viral DNA present in cells infected with the individual isolates was used for the pol and env clones, and DNA derived directly from plasma by RT-PCR was used for the gag clone. Total DNA was extracted from the infected cell pellets using the QIAGEN DNeasy Tissue Kit. This DNA was used in PCR reactions using the following primers (HXB2 numbering, Los Alamos database) in a nested PCR amplification strategy:

[0136] gag: outer, 623-640 and 2391-2408; inner, 789-810 and 2330-2350.

[0137] pol: outer, 2050-2073 and 5119-5148; inner, 2085-2108 and 5068-5094.

[0138] env: outer, 6195-6218 and 8807-8830; inner, 6225-6245 and 8758-8795.
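As a quick check on the nested design, the primer coordinates listed above can be used to estimate amplicon sizes and to confirm that each inner product falls inside its outer product. The Python sketch below is illustrative: it assumes the coordinates are inclusive, that each product runs from the first base of the forward primer to the last base of the reverse primer, and that lengths are measured on the reference numbering, so actual products from a given isolate may differ because of insertions and deletions. The names `PRIMERS`, `amplicon_length` and `is_nested` are hypothetical.

```python
from typing import Dict, Tuple

Span = Tuple[int, int]          # inclusive start and end of one primer
Pair = Tuple[Span, Span]        # (forward primer, reverse primer)

# Primer coordinates as listed in the text (reference numbering).
PRIMERS: Dict[str, Dict[str, Pair]] = {
    "gag": {"outer": ((623, 640), (2391, 2408)), "inner": ((789, 810), (2330, 2350))},
    "pol": {"outer": ((2050, 2073), (5119, 5148)), "inner": ((2085, 2108), (5068, 5094))},
    "env": {"outer": ((6195, 6218), (8807, 8830)), "inner": ((6225, 6245), (8758, 8795))},
}


def amplicon_length(pair: Pair) -> int:
    """Length from the first forward-primer base to the last reverse-primer base."""
    (fwd_start, _), (_, rev_end) = pair
    return rev_end - fwd_start + 1


def is_nested(outer: Pair, inner: Pair) -> bool:
    """True when the inner amplicon lies entirely within the outer amplicon."""
    return outer[0][0] <= inner[0][0] and inner[1][1] <= outer[1][1]


if __name__ == "__main__":
    for gene, pairs in PRIMERS.items():
        print(
            gene,
            amplicon_length(pairs["outer"]),
            amplicon_length(pairs["inner"]),
            is_nested(pairs["outer"], pairs["inner"]),
        )
```

Under these assumptions every inner pair is contained in its outer pair, which is what a nested amplification strategy requires.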

[0139] The PCR products were blunt-end cloned into pT7Blue using the Novagen pT7Blue Blunt Kit. The inserts were characterized by doing colony PCR to identify clones with gene inserts. The identity of the insert was confirmed by sequencing the insert on both strands and comparing this sequence to the original sequence.

[0140] Modification of Clones

[0141] Several modifications were introduced to the cloned genes, as shown in FIGS. 23 to 28. In order to increase levels of expression of proteins, the DNA sequence was resynthesized and the following modifications were made:

[0142] the codon usage was changed to reflect human codon usage for increased expression; and

[0143] the inhibitory and rev responsive elements were also removed.

[0144] The modifications to the gag gene sequence of Du422 are shown in Sequence I.D. numbers 7 and 8.

[0145] Also for the DNA, modified vaccinia Ankara (MVA) and BCG vaccines, the pol gene was truncated so that only the protease, reverse transcriptase and RNase H regions of the pol gene will be expressed. In addition, the active site amino acid motif YMDD has been mutated to YMM so that the expressed reverse transcriptase will be catalytically inactive. The modifications to the pol gene of Du151 are shown in Sequence I.D. numbers 9 and 10.

[0146] Synthetic Genes

[0147] The complete gag and env genes were resynthesized to optimise the codons for expression in human cells, also shown in Sequence I.D. numbers 9 to 12. During this process the inhibitory sequences (INS) and rev responsive elements (RRE) were removed, which has been reported to result in increased expression. The gag gene myristylation signal was mutated as described above and as shown in Sequence I.D. numbers 7 and 8.
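The idea of re-encoding a protein with codons favoured in human genes can be illustrated with a simple back-translation. The Python sketch below is a deliberately simplified stand-in for the resynthesis described above and does not reproduce it: the one-codon-per-residue table is an illustrative assumption based on codons commonly reported as frequent in human genes, real optimisation schemes usually weigh codon frequencies and sequence context, and the sketch does not model the removal of INS or RRE elements. The names `PREFERRED_CODON`, `back_translate` and `translate_preferred` are hypothetical.

```python
# Illustrative one-codon-per-residue table (an assumption of this sketch).
PREFERRED_CODON = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",  # stop codon
}

# Reverse lookup, valid only for DNA built from the table above.
CODON_TO_AA = {codon: aa for aa, codon in PREFERRED_CODON.items()}


def back_translate(protein: str) -> str:
    """Encode a one-letter protein sequence using the preferred codon table."""
    try:
        return "".join(PREFERRED_CODON[aa] for aa in protein.upper())
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]}") from None


def translate_preferred(dna: str) -> str:
    """Translate DNA produced by back_translate (used as a round-trip check)."""
    if len(dna) % 3 != 0:
        raise ValueError("DNA length must be a multiple of three")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))
```

For instance, the first six residues of the gag protein in the sequence listing (Met Gly Ala Arg Ala Ser, "MGARAS") would be encoded as "ATGGGCGCCAGAGCCAGC" under this table, and translating that DNA back recovers the same six residues, which confirms that re-encoding changes the nucleotide sequence but not the protein.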

[0148] The following material has been deposited with the European Collection of Cell Cultures, Centre for Applied Microbiology and Research, Salisbury, Wiltshire SP4 0JG, United Kingdom (ECACC).

[0149] Deposits

Material | ECACC Deposit No. | Deposit Date
HIV-1 Viral isolate Du151 | Accession Number 00072724 | 27 Jul. 2000
HIV-1 Viral isolate Du179 | Accession Number 00072725 | 27 Jul. 2000
HIV-1 Viral isolate Du422 | Provisional Accession Number 00072726 | 27 Jul. 2000
 | Provisional Accession Number 01032114 | 22 Mar. 2001

[0150] The deposit was made under the provisions of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purpose of Patent Procedure and regulations thereunder (Budapest Treaty).

References

[0151] UNAIDS. AIDS epidemic update. December 1999. www.unaids.org/hivaidsinfo/documents.html

[0152] Binley J M, Sanders R W, Clas B, Schuelke N, Master A, Guo Y, Kajumo F, Anselma D J, Maddon P J, Olson W C, Moore J P. J Virol 2000 Jan;74(2):627-43.

[0153] Bjorndal, A., Sonnerborg, A., Tscherning, C., Albert, J. & Fenyo, E. M. (1999). Phenotypic characteristics of human immunodeficiency virus type 1 subtype C isolates of Ethiopian.

[0154] Connor, R., Sheridan, K., Ceradini, D., Choe, S. & Landau, N. (1997). Change in co-receptor use correlates with disease progression in HIV-1-infected individuals. J Exp Med 185, 621-628.

[0155] Durali D, Morvan J, Letourneur F, Schmitt D, Guegan N, Dalod M, Saragosti S, Sicard D, Levy J P & Gomard E (1998). Cross-reactions between the cytotoxic T-lymphocyte responses of human immunodeficiency virus-infected African and European patients. J Virol 72:3547-53.

[0156] Ferrari G, Humphrey W, McElrath M J, Excler J L, Duliege A M, Clements M L, Corey L C, Bolognesi D P & Weinhold K J (1997). Clade B-based HIV-1 vaccines elicit cross-clade cytotoxic T lymphocyte reactivities in uninfected volunteers. Proc Natl Acad Sci U S A 18;94(4):1396-401.

[0157] HIV Molecular Immunology Database 1998: Korber B, Brander C, Koup R, Walker B, Haynes B, & Moore J, Eds. Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, N.M.

[0158] Kostrikis, L. G., Cao, Y., Ngai, H., Moore, J. P. & Ho, D. D. (1996). Quantitative analysis of serum neutralization of human immunodeficiency virus type 1 from subtypes A, B, C, D, E, F, and I: lack of direct correlation between neutralization serotypes and genetic subtypes and evidence for prevalent serum-dependent infectivity enhancement. J. Virol. 70, 445-458.

[0159] Koup R A, Safrit J T, Cao Y, Andrews C A, McLeod G, Borkowsky W, Farthing C, Ho D D (1994). Temporal association of cellular immune responses with the initial control of viremia in primary human immunodeficiency virus type 1 syndrome. J Virol. 68(7):4650-5.

[0160] Moore J P, Cao Y, Leu J, Qin L, Korber B & Ho D D (1996). Inter- and intraclade neutralization of human immunodeficiency virus type 1: genetic clades do not correspond to neutralization serotypes but partially correspond to gp120 antigenic serotypes. J. Virol. 70, 427-444.

[0161] Ogg G S, Kostense S, Klein M R, Jurriaans S, Hamann D, McMichael A J & Miedema F (1999). Longitudinal phenotypic analysis of human immunodeficiency virus type 1-specific cytotoxic T lymphocytes: correlation with disease progression. J Virol 73(11):9153-60.

[0162] Peeters, M., Vincent, R., Perret, J.-L., Lasky, M., Patrel, D., Liegeois, F., Courgnaud, V., Seng, R., Matton, T., Molinier, S. & Delaporte, E. (1999). Evidence for differences in MT2 cell tropism according to genetic subtypes of HIV-1: syncytium-inducing variants seem rare among subtype C HIV-1 viruses. J Acquir Imm Def Synd 20, 115-121.

[0163] Richman, D. & Bozzette, S. (1994). The impact of the syncytium-inducing phenotype of human immunodeficiency virus on disease progression. J Inf Dis 169, 968-974.

[0164] Robertson D L, Anderson J P, Bradac J A, Carr J K, Foley B, Funkhouser R K, Gao R, Hahn B H, Kalish M L, Kuiken C, Learn G H, Leitner T, McCutchan F, Osmanov S, Peeters M, Pieniazek D, Salminen M, Sharp P M, Wolinsky S, Korber B (2000). HIV nomenclature proposal. Science 7;288(5463):55-6.

[0165] Rowland-Jones S L, Dong T, Fowke K R, Kimani J, Krausa P, Newell H, Blanchard T, Ariyoshi K, Oyugi J, Ngugi E, Bwayo J, MacDonald K S, McMichael A J & Plummer F A (1998). Cytotoxic T-cell responses to multiple conserved epitopes in HIV-resistant prostitutes in Nairobi. J. Clin. Invest. 102(9):1758-1765.

[0166] Scarlatti, G., Tresoldi, E., Bjorndal, A., Fredriksson, R., Colognesi, C., Deng, H., Malnati, M., Plebani, A., Siccardi, A., Littman, D., Fenyo, E. & Lusso, P. (1997). In vivo evolution of HIV-1 co-receptor usage and sensitivity to chemokine-mediated suppression. Nat Med 3, 1259-1265.

[0167] Schmitz J E, Kuroda M J, Santra S, Sasseville V G, Simon M A, Lifton M A, Racz P, Tenner-Racz K, Dalesandro M, Scallon B J, Ghrayeb J, Forman M A, Montefiori D C, Rieber E P, Letvin N L, Reimann K A (1999). Control of viremia in simian immunodeficiency virus infection by CD8+ lymphocytes. Science 5;283(5403):857-60.

[0168] Summary Report: National HIV sero-prevalence survey of women attending public antenatal clinics in South Africa, 1999 (2000). Department of Health, Directorate: Health Systems Research & Epidemiology, April 2000.

[0169] Tscherning, C., Alaeus, A., Fredriksson, R., Bjorndal, A., Deng, H., Littman, D., Fenyo, E. M. & Albert, J. (1998). Differences in chemokine co-receptor usage between genetic subtypes of HIV-1. Virology 241, 181-188.

[0170] Wyatt R and Sodroski J (1998). The HIV-1 envelope glycoproteins: Fusogens, antigens and immunogens. Science 280(5371):1884-8.

[0171] Wyatt R, Kwong, Desjardins E, Sweet R W, Robinson J, Hendrickson W A & Sodroski J G (1998). The antigenic structure of the HIV gp120 envelope glycoprotein. Nature 393(6686):705-11.

1 32 1 1479 DNA Human immunodeficiency virus type 1 1 atgggtgcgagagcgtcaat attaagaggg gaaaaattag ataaatggga aaaaattagg 60 ttaaggccagggggaaagaa acattatatg ttaaaacaca tagtatgggc aagcagggag 120 ctggaaagatttgcacttaa ccctggcctt ttagaaacat cagaaggatg taaacaaata 180 atgaaacagctacaaccagc tctccagaca ggaacagagg aacttaaatc attatacaac 240 acagtagcaactctctattg tgtacatgaa aagatagaag tacgagacac caaggaagcc 300 ttagataagatagaggaaga acaaaacaaa tgtcagcaaa aaacgcagca ggcaaaagcg 360 gctgacgggaaagtcagtca aaattatcct atagtgcaga atctccaagg gcaaatggta 420 catcaagccatatcacctag aaccttgaat gcatgggtaa aagtaataga agaaaaggct 480 tttagcccagaggtaatacc catgtttaca gcattatcag aaggagccac cccacaagat 540 ttaaacaccatgttaaatac agtgggggga catcaagcag ccatgcaaat gttaaaagat 600 actattaatgaagaggctgc agaatgggat agagtacatc cagtccatgc ggggcctatt 660 gcaccaggccagatgagaga accaagggga agtgacatag caggaactac tagtaccctt 720 caggaacaaatagcatggat gacaagtaac ccacctattc cagtgggaga catctataaa 780 agatggataattctggggtt aaataaaata gtgagaatgt atagcccggt cagcattttg 840 gacataagacaagggccaaa ggaacccttt cgagactatg tagatcggtt ctttaaaact 900 ttaagagctgaacaagctac acaagaagta aaaaattgga tgacagacac cttgttagtc 960 caaaatgcgaacccagattg taagaccatt ttgagagcat taggaccagg ggctacatta 1020 gaagaaatgatgacagcatg tcaaggggtg ggaggacctg gtcacaaagc aagagtattg 1080 gctgaggcaatgagtcaagc aaacagtgga aacataatga tgcagagaag caattttaaa 1140 ggccctagaagaattgttaa atgttttaac tgtggcaagg aagggcacat agccagaaat 1200 tgcagagcccctaggaaaaa aggctgttgg aaatgtggaa aggaaggaca ccaaatgaaa 1260 gactgtactgaaaggcaggc taatttttta gggaaaattt ggccttccca caaggggagg 1320 ccagggaatttccttcagaa cagaccagag ccaacagccc caccagcaga gagcttcagg 1380 ttcgaagagacaacccccgc tccgaaacag gagccgatag aaagggaacc cttaacttcc 1440 ctcaaatcactctttggcag cgaccccttg tctcaataa 1479 2 492 PRT Human immunodeficiencyvirus type 1 2 Met Gly Ala Arg Ala Ser Ile Leu Arg Gly Glu Lys Leu AspLys Trp 1 5 10 15 Glu Lys Ile Arg Leu Arg Pro Gly Gly Lys Lys His TyrMet Leu Lys 20 25 30 His Ile Val Trp Ala Ser Arg Glu Leu Glu Arg 
Phe AlaLeu Asn Pro 35 40 45 Gly Leu Leu Glu Thr Ser Glu Gly Cys Lys Gln Ile MetLys Gln Leu 50 55 60 Gln Pro Ala Leu Gln Thr Gly Thr Glu Glu Leu Lys SerLeu Tyr Asn 65 70 75 80 Thr Val Ala Thr Leu Tyr Cys Val His Glu Lys IleGlu Val Arg Asp 85 90 95 Thr Lys Glu Ala Leu Asp Lys Ile Glu Glu Glu GlnAsn Lys Cys Gln 100 105 110 Gln Lys Thr Gln Gln Ala Lys Ala Ala Asp GlyLys Val Ser Gln Asn 115 120 125 Tyr Pro Ile Val Gln Asn Leu Gln Gly GlnMet Val His Gln Ala Ile 130 135 140 Ser Pro Arg Thr Leu Asn Ala Trp ValLys Val Ile Glu Glu Lys Ala 145 150 155 160 Phe Ser Pro Glu Val Ile ProMet Phe Thr Ala Leu Ser Glu Gly Ala 165 170 175 Thr Pro Gln Asp Leu AsnThr Met Leu Asn Thr Val Gly Gly His Gln 180 185 190 Ala Ala Met Gln MetLeu Lys Asp Thr Ile Asn Glu Glu Ala Ala Glu 195 200 205 Trp Asp Arg ValHis Pro Val His Ala Gly Pro Ile Ala Pro Gly Gln 210 215 220 Met Arg GluPro Arg Gly Ser Asp Ile Ala Gly Thr Thr Ser Thr Leu 225 230 235 240 GlnGlu Gln Ile Ala Trp Met Thr Ser Asn Pro Pro Ile Pro Val Gly 245 250 255Asp Ile Tyr Lys Arg Trp Ile Ile Leu Gly Leu Asn Lys Ile Val Arg 260 265270 Met Tyr Ser Pro Val Ser Ile Leu Asp Ile Arg Gln Gly Pro Lys Glu 275280 285 Pro Phe Arg Asp Tyr Val Asp Arg Phe Phe Lys Thr Leu Arg Ala Glu290 295 300 Gln Ala Thr Gln Glu Val Lys Asn Trp Met Thr Asp Thr Leu LeuVal 305 310 315 320 Gln Asn Ala Asn Pro Asp Cys Lys Thr Ile Leu Arg AlaLeu Gly Pro 325 330 335 Gly Ala Thr Leu Glu Glu Met Met Thr Ala Cys GlnGly Val Gly Gly 340 345 350 Pro Gly His Lys Ala Arg Val Leu Ala Glu AlaMet Ser Gln Ala Asn 355 360 365 Ser Gly Asn Ile Met Met Gln Arg Ser AsnPhe Lys Gly Pro Arg Arg 370 375 380 Ile Val Lys Cys Phe Asn Cys Gly LysGlu Gly His Ile Ala Arg Asn 385 390 395 400 Cys Arg Ala Pro Arg Lys LysGly Cys Trp Lys Cys Gly Lys Glu Gly 405 410 415 His Gln Met Lys Asp CysThr Glu Arg Gln Ala Asn Phe Leu Gly Lys 420 425 430 Ile Trp Pro Ser HisLys Gly Arg Pro Gly Asn Phe Leu Gln Asn Arg 435 440 445 Pro Glu Pro ThrAla Pro Pro Ala Glu Ser Phe Arg Phe Glu Glu Thr 450 455 460 Thr Pro 
AlaPro Lys Gln Glu Pro Ile Glu Arg Glu Pro Leu Thr Ser 465 470 475 480 LeuLys Ser Leu Phe Gly Ser Asp Pro Leu Ser Gln 485 490 3 2997 DNA Humanimmunodeficiency virus type 1 3 tttagggaaa atttggcctt cccacaaggggaggccaggg aatttccttc agaacagacc 60 agagccaaca gccccaccag cagagagcttcaggttcgaa gaaacaaccc ccgctccgaa 120 acaggagccg agagaaaggg aacccttaacttccctcaaa tcactctttg gcagcgaccc 180 cttgtctcaa taaaaatagg gggccagacaagggaggctc tcttagacac aggagcagat 240 gatacagtat tagaagacat aaatttgccaggaaaatgga aaccaaaaat gataggagga 300 attggaggtt ttatcaaagt aagacagtatgatcaaatac ttatagaaat ttgtggaaaa 360 aaggctatag gtacagtatt agtagggcctacacctgtca acataattgg cagaaacatg 420 ttgactcagc ttggatgcac actaaactttccaatcagtc ccattgaaac tgtaccagta 480 aaactgaagc caggaatgga tggcccaaaggttaaacaat ggccgttaac agaagagaaa 540 ataaaagcat taacagcaat ttgtgaagaaatggaaaagg aaggaaaaat tacaaaaatt 600 gggcctgaaa atccatataa cactccaatatttgccataa aaaagaaaga cagcactaag 660 tggagaaaat tagtagattt cagggaactcaataaaagaa ctcaagactt ttgggaggtt 720 caattaggaa taccacaccc agcagggttaaaaaagaaaa aatcagtgac agtactggat 780 gtgggagatg catatttttc agttcctttagatgaaggct tcaggaaata tactgcattc 840 accataccta gtataaacaa tgaaacaccagggattagat atcaatataa tgtgcttcca 900 caaggatgga aagggtcacc agcaatattccagggtagca tgacaaaaat cttagagccc 960 tttagagctc aaaatccaga aatagtcatctatcaatata tggatgactt gtatgtagga 1020 tctgacttag aaatagggca acatagagcaaaaatagaag agttaagaga acatctatta 1080 aagtggggat ttaccacacc agacaaaaaacatcagaaag aacccccatt tctttggatg 1140 gggtatgaac tccatcctga caaatggacagtacagccta tacagctgcc agaaaaggat 1200 agctggactg tcaatgatat acagaagttagtgggaaaat taaactgggc aagtcagatt 1260 tacccaggga ttaaagtaag gcaactttgtaagctcctta gggggaccaa agcactaaca 1320 gacatagtac cactaactga agaagcagaattagaattgg cagagaacag ggaaattcta 1380 aaagaaccag tgcatggagt atattatgacccatcaaaag acttgatagc tgaaatacag 1440 aaacaggggg atgaccaatg gacatatcaaatttaccaag aaccattcaa aaacctgaag 1500 acaggaaagt atgcaaaaag gaggactacccacactaatg atgtaaaaca gttaacagag 1560 gcagtgcaaa aaatatcctt 
ggaaagcatagtaatatggg gaaagactcc taaatttaga 1620 ctacccatcc aaaaagaaac atgggaaatatggtggacag actattggca agccacatgg 1680 attcctgagt gggagtttgt taatacccctcccctagtaa aactatggta ccagctagaa 1740 aaagaaccca tagcaggagc agaaactttctatgtagatg gagcagctaa tagggaaact 1800 aaaataggaa aagcggggta tgttactgacagaggaaggc agaaaattgt aactctaagt 1860 gaaacaacaa atcagaagac tgaattacaagcaattcagc tagctttgca agattcagaa 1920 tcagaagtaa acataataac agactcacagtacgcattag gaatcattca agcacaacca 1980 gataggagtg aatcagagtt ggtcaatcaaataatagaac aattaataaa aaaggaaagg 2040 gtctatctgt catgggtacc agcacacaacggacttgcag gaaatgaaca tgtagataaa 2100 ttagtaagta ggggaatcag gaaagtgctggttctagatg gaatagataa ggctcatgaa 2160 gagcatgaaa agtatcacag caattggagagcaatggcta gtgagtttaa tctgccaccc 2220 gtagtagcaa gagaaatagt agccagctgtgataaatgtc agctaaaagg ggaagccata 2280 catggacaag tagattgtag tccggggatatggcaattag attgtacaca tttagaagga 2340 aaaatcatcc tggtagcagt ccatgtagccagtggctaca tagaagcaga ggttatccca 2400 gcagaaacag gacaagaaac agcatactatatactaaaat tagcaggaag atggccagtc 2460 aaagtaatac atacagacaa tggcagtaatttcaccagtg ctgcagttaa ggcagcctgt 2520 tggtgggcag gtatccaaca ggaatttgggattccctaca atccccaaag tcagggagta 2580 gtagaatcca tgaataaaga attaaagaaaatcatagggc aggtaagaga tcaagctgag 2640 caccttaaga cagcagtaca aatggcagtattcattcaca attttaaaag aaaagggggg 2700 attggggggt acagtgcagg ggaaagaataatagacataa tagcaacaga catacaaact 2760 aaagaattac aaaaacaaat tataaaaattcaaaattttc gggtttatta cagagacagc 2820 agagatccta tttggaaagg accagccaagctactctgga aaggtgaagg ggcagtagta 2880 atacaagaca acagtgacat aaaggtagtaccaaggagga aagtaaaaat cattagggac 2940 tatggaaaac agatggcagg tgctgattgtgtggcaggta gacaggatga agattag 2997 4 998 PRT Human immunodeficiencyvirus type 1 4 Phe Arg Glu Asn Leu Ala Phe Pro Gln Gly Glu Ala Arg GluPhe Pro 1 5 10 15 Ser Glu Gln Thr Arg Ala Asn Ser Pro Thr Ser Arg GluLeu Gln Val 20 25 30 Arg Arg Asn Asn Pro Arg Ser Glu Thr Gly Ala Glu ArgLys Gly Thr 35 40 45 Leu Asn Phe Pro Gln Ile Thr Leu Trp Gln Arg Pro LeuVal Ser Ile 50 55 60 Lys Ile 
Gly Gly Gln Thr Arg Glu Ala Leu Leu Asp ThrGly Ala Asp 65 70 75 80 Asp Thr Val Leu Glu Asp Ile Asn Leu Pro Gly LysTrp Lys Pro Lys 85 90 95 Met Ile Gly Gly Ile Gly Gly Phe Ile Lys Val ArgGln Tyr Asp Gln 100 105 110 Ile Leu Ile Glu Ile Cys Gly Lys Lys Ala IleGly Thr Val Leu Val 115 120 125 Gly Pro Thr Pro Val Asn Ile Ile Gly ArgAsn Met Leu Thr Gln Leu 130 135 140 Gly Cys Thr Leu Asn Phe Pro Ile SerPro Ile Glu Thr Val Pro Val 145 150 155 160 Lys Leu Lys Pro Gly Met AspGly Pro Lys Val Lys Gln Trp Pro Leu 165 170 175 Thr Glu Glu Lys Ile LysAla Leu Thr Ala Ile Cys Glu Glu Met Glu 180 185 190 Lys Glu Gly Lys IleThr Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr 195 200 205 Pro Ile Phe AlaIle Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu 210 215 220 Val Asp PheArg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val 225 230 235 240 GlnLeu Gly Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val 245 250 255Thr Val Leu Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu 260 265270 Gly Phe Arg Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu 275280 285 Thr Pro Gly Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys290 295 300 Gly Ser Pro Ala Ile Phe Gln Gly Ser Met Thr Lys Ile Leu GluPro 305 310 315 320 Phe Arg Ala Gln Asn Pro Glu Ile Val Ile Tyr Gln TyrMet Asp Asp 325 330 335 Leu Tyr Val Gly Ser Asp Leu Glu Ile Gly Gln HisArg Ala Lys Ile 340 345 350 Glu Glu Leu Arg Glu His Leu Leu Lys Trp GlyPhe Thr Thr Pro Asp 355 360 365 Lys Lys His Gln Lys Glu Pro Pro Phe LeuTrp Met Gly Tyr Glu Leu 370 375 380 His Pro Asp Lys Trp Thr Val Gln ProIle Gln Leu Pro Glu Lys Asp 385 390 395 400 Ser Trp Thr Val Asn Asp IleGln Lys Leu Val Gly Lys Leu Asn Trp 405 410 415 Ala Ser Gln Ile Tyr ProGly Ile Lys Val Arg Gln Leu Cys Lys Leu 420 425 430 Leu Arg Gly Thr LysAla Leu Thr Asp Ile Val Pro Leu Thr Glu Glu 435 440 445 Ala Glu Leu GluLeu Ala Glu Asn Arg Glu Ile Leu Lys Glu Pro Val 450 455 460 His Gly ValTyr Tyr Asp Pro Ser Lys Asp Leu Ile Ala Glu Ile Gln 465 470 475 480 LysGln Gly Asp Asp Gln Trp Thr Tyr Gln 
Ile Tyr Gln Glu Pro Phe 485 490 495Lys Asn Leu Lys Thr Gly Lys Tyr Ala Lys Arg Arg Thr Thr His Thr 500 505510 Asn Asp Val Lys Gln Leu Thr Glu Ala Val Gln Lys Ile Ser Leu Glu 515520 525 Ser Ile Val Ile Trp Gly Lys Thr Pro Lys Phe Arg Leu Pro Ile Gln530 535 540 Lys Glu Thr Trp Glu Ile Trp Trp Thr Asp Tyr Trp Gln Ala ThrTrp 545 550 555 560 Ile Pro Glu Trp Glu Phe Val Asn Thr Pro Pro Leu ValLys Leu Trp 565 570 575 Tyr Gln Leu Glu Lys Glu Pro Ile Ala Gly Ala GluThr Phe Tyr Val 580 585 590 Asp Gly Ala Ala Asn Arg Glu Thr Lys Ile GlyLys Ala Gly Tyr Val 595 600 605 Thr Asp Arg Gly Arg Gln Lys Ile Val ThrLeu Ser Glu Thr Thr Asn 610 615 620 Gln Lys Thr Glu Leu Gln Ala Ile GlnLeu Ala Leu Gln Asp Ser Glu 625 630 635 640 Ser Glu Val Asn Ile Ile ThrAsp Ser Gln Tyr Ala Leu Gly Ile Ile 645 650 655 Gln Ala Gln Pro Asp ArgSer Glu Ser Glu Leu Val Asn Gln Ile Ile 660 665 670 Glu Gln Leu Ile LysLys Glu Arg Val Tyr Leu Ser Trp Val Pro Ala 675 680 685 His Asn Gly LeuAla Gly Asn Glu His Val Asp Lys Leu Val Ser Arg 690 695 700 Gly Ile ArgLys Val Leu Val Leu Asp Gly Ile Asp Lys Ala His Glu 705 710 715 720 GluHis Glu Lys Tyr His Ser Asn Trp Arg Ala Met Ala Ser Glu Phe 725 730 735Asn Leu Pro Pro Val Val Ala Arg Glu Ile Val Ala Ser Cys Asp Lys 740 745750 Cys Gln Leu Lys Gly Glu Ala Ile His Gly Gln Val Asp Cys Ser Pro 755760 765 Gly Ile Trp Gln Leu Asp Cys Thr His Leu Glu Gly Lys Ile Ile Leu770 775 780 Val Ala Val His Val Ala Ser Gly Tyr Ile Glu Ala Glu Val IlePro 785 790 795 800 Ala Glu Thr Gly Gln Glu Thr Ala Tyr Tyr Ile Leu LysLeu Ala Gly 805 810 815 Arg Trp Pro Val Lys Val Ile His Thr Asp Asn GlySer Asn Phe Thr 820 825 830 Ser Ala Ala Val Lys Ala Ala Cys Trp Trp AlaGly Ile Gln Gln Glu 835 840 845 Phe Gly Ile Pro Tyr Asn Pro Gln Ser GlnGly Val Val Glu Ser Met 850 855 860 Asn Lys Glu Leu Lys Lys Ile Ile GlyGln Val Arg Asp Gln Ala Glu 865 870 875 880 His Leu Lys Thr Ala Val GlnMet Ala Val Phe Ile His Asn Phe Lys 885 890 895 Arg Lys Gly Gly Ile GlyGly Tyr Ser Ala Gly Glu Arg Ile Ile Asp 900 905 
910 Ile Ile Ala Thr AspIle Gln Thr Lys Glu Leu Gln Lys Gln Ile Ile 915 920 925 Lys Ile Gln AsnPhe Arg Val Tyr Tyr Arg Asp Ser Arg Asp Pro Ile 930 935 940 Trp Lys GlyPro Ala Lys Leu Leu Trp Lys Gly Glu Gly Ala Val Val 945 950 955 960 IleGln Asp Asn Ser Asp Ile Lys Val Val Pro Arg Arg Lys Val Lys 965 970 975Ile Ile Arg Asp Tyr Gly Lys Gln Met Ala Gly Ala Asp Cys Val Ala 980 985990 Gly Arg Gln Asp Glu Asp 995 5 2535 DNA Human immunodeficiency virustype 1 5 atgagagtga tggggataca gaggaattgg ccacaatggt ggatatggggcaccttaggc 60 ttttggatga taataatttg tagggtggtg gggaacttga acttgtgggtcacagtctat 120 tatggggtac ctgtgtggaa agaagcaaaa actactctat tctgtgcatcagatgctaaa 180 gcatatgata aagaagtaca taatgtctgg gctacacatg cctgtgtacccacagacccc 240 aacccacgag aaatagtttt ggaaaatgta acagaaaatt ttaacatgtggaaaaatgac 300 atggtggatc agatgcatga ggatataatc agtttatggg atcaaagcctaaaaccatgt 360 gtaaagttga ccccactctg tgtcacttta aattgtacaa atgcacctgcctacaataat 420 agcatgcatg gagaaatgaa aaattgctct ttcaatacaa ccacagagataagagatagg 480 aaacagaaag cgtatgcact tttttataaa cctgatgtag tgccacttaataggagagaa 540 gagaataatg ggacaggaga gtatatatta ataaattgca attcctcaaccataacacaa 600 gcctgtccaa aggtcacttt tgacccaatt cctatacatt attgtgctccagctggttat 660 gcgattctaa agtgtaataa taagacattc aatgggacag gaccatgcaataatgtcagc 720 acagtacaat gtacacatgg aattatgcca gtggtatcaa ctcaattactgttaaatggt 780 agcctagcag aagaagagat aataattaga tctgaaaatc tgacaaacaatatcaaaaca 840 ataatagtcc accttaataa atctgtagaa attgtgtgta caagacccaacaataataca 900 agaaaaagta taaggatagg accaggacaa acattctatg caacaggtgaaataatagga 960 aacataagag aagcacattg taacattagt aaaagtaact ggaccagtactttagaacag 1020 gtaaagaaaa aattaaaaga acactacaat aagacaatag aatttaacccaccctcagga 1080 ggggatctag aagttacaac acatagcttt aattgtagag gagaatttttctattgcaat 1140 acaacaaaac tgttttcaaa caacagtgat tcaaacaacg aaaccatcacactcccatgc 1200 aagataaaac aaattataaa catgtggcag aaggtaggac gagcaatgtatgcccctccc 1260 attgaaggaa acataacatg taaatcaaat atcacaggac tactattgacacgtgatgga 1320 ggaaagaata 
caacaaatga gatattcaga ccgggaggag gaaatatgaaggacaattgg 1380 agaagtgaat tatataaata taaagtggta gaaattgagc cattgggagtagcacccact 1440 aaatcaaaaa ggagagtggt ggagagagaa aaaagagcag tgggactaggagctgtactc 1500 cttgggttct tgggagcagc aggaagcact atgggcgcgg cgtcaataacgctgacggta 1560 caggccagac aactgttgtc tggtatagtg caacagcaaa gcaatttgctgagagctata 1620 gaggcgcaac agcatatgtt gcaactcacg gtctggggca ttaagcagctccagacaaga 1680 gtcttggcta tagagagata cctaaaggat caacagctcc tagggctttggggctgctct 1740 ggaaaaatca tctgcaccac tgctgtgcct tggaactcca gttggagtaataaatctcaa 1800 gaagatattt gggataacat gacctggatg cagtgggata gagaaattagtaattacaca 1860 ggcacaatat ataggttact tgaagactcg caaaaccagc aggagaaaaatgaaaaagat 1920 ttattagcat tggacagttg gaaaaacttg tggaattggt ttaacataacaaattggctg 1980 tggtatataa aaatattcat catgatagta ggaggcttga taggtttgagaataattttt 2040 ggtgtactcg ctatagtgaa aagagttagg cagggatact cacctttgtcgtttcagacc 2100 cttaccccaa gcccgagggg tcccgacagg ctcggaagaa tcgaagaagaaggtggagag 2160 caagacaaag acagatccat tcgattagtg agcggattct tagcacttgcctgggacgat 2220 ctgcggagcc tgtgcctctt cagctaccac cacttgagag acttcatattgattgcagcg 2280 agagcagcgg aacttctggg acgcagcagt ctcaggggac tgcagagagggtgggaagcc 2340 cttaagtatc tgggaaatct tgtgcagtat gggggtctgg agctaaaaagaagtgctatt 2400 aaactgtttg ataccatagc aatagcagta gctgaaggaa cagataggattcttgaagta 2460 atacagagaa tttgtagagc tatccgccac atacctataa gaataagacagggctttgaa 2520 gcagctttgc aataa 2535 6 844 PRT Human immunodeficiencyvirus type 1 6 Met Arg Val Met Gly Ile Gln Arg Asn Trp Pro Gln Trp TrpIle Trp 1 5 10 15 Gly Thr Leu Gly Phe Trp Met Ile Ile Ile Cys Arg ValVal Gly Asn 20 25 30 Leu Asn Leu Trp Val Thr Val Tyr Tyr Gly Val Pro ValTrp Lys Glu 35 40 45 Ala Lys Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys AlaTyr Asp Lys 50 55 60 Glu Val His Asn Val Trp Ala Thr His Ala Cys Val ProThr Asp Pro 65 70 75 80 Asn Pro Arg Glu Ile Val Leu Glu Asn Val Thr GluAsn Phe Asn Met 85 90 95 Trp Lys Asn Asp Met Val Asp Gln Met His Glu AspIle Ile Ser Leu 100 105 110 Trp Asp Gln Ser Leu Lys Pro 
Cys Val Lys LeuThr Pro Leu Cys Val 115 120 125 Thr Leu Asn Cys Thr Asn Ala Pro Ala TyrAsn Asn Ser Met His Gly 130 135 140 Glu Met Lys Asn Cys Ser Phe Asn ThrThr Thr Glu Ile Arg Asp Arg 145 150 155 160 Lys Gln Lys Ala Tyr Ala LeuPhe Tyr Lys Pro Asp Val Val Pro Leu 165 170 175 Asn Arg Arg Glu Glu AsnAsn Gly Thr Gly Glu Tyr Ile Leu Ile Asn 180 185 190 Cys Asn Ser Ser ThrIle Thr Gln Ala Cys Pro Lys Val Thr Phe Asp 195 200 205 Pro Ile Pro IleHis Tyr Cys Ala Pro Ala Gly Tyr Ala Ile Leu Lys 210 215 220 Cys Asn AsnLys Thr Phe Asn Gly Thr Gly Pro Cys Asn Asn Val Ser 225 230 235 240 ThrVal Gln Cys Thr His Gly Ile Met Pro Val Val Ser Thr Gln Leu 245 250 255Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Ile Ile Ile Arg Ser Glu 260 265270 Asn Leu Thr Asn Asn Ile Lys Thr Ile Ile Val His Leu Asn Lys Ser 275280 285 Val Glu Ile Val Cys Thr Arg Pro Asn Asn Asn Thr Arg Lys Ser Ile290 295 300 Arg Ile Gly Pro Gly Gln Thr Phe Tyr Ala Thr Gly Glu Ile IleGly 305 310 315 320 Asn Ile Arg Glu Ala His Cys Asn Ile Ser Lys Ser AsnTrp Thr Ser 325 330 335 Thr Leu Glu Gln Val Lys Lys Lys Leu Lys Glu HisTyr Asn Lys Thr 340 345 350 Ile Glu Phe Asn Pro Pro Ser Gly Gly Asp LeuGlu Val Thr Thr His 355 360 365 Ser Phe Asn Cys Arg Gly Glu Phe Phe TyrCys Asn Thr Thr Lys Leu 370 375 380 Phe Ser Asn Asn Ser Asp Ser Asn AsnGlu Thr Ile Thr Leu Pro Cys 385 390 395 400 Lys Ile Lys Gln Ile Ile AsnMet Trp Gln Lys Val Gly Arg Ala Met 405 410 415 Tyr Ala Pro Pro Ile GluGly Asn Ile Thr Cys Lys Ser Asn Ile Thr 420 425 430 Gly Leu Leu Leu ThrArg Asp Gly Gly Lys Asn Thr Thr Asn Glu Ile 435 440 445 Phe Arg Pro GlyGly Gly Asn Met Lys Asp Asn Trp Arg Ser Glu Leu 450 455 460 Tyr Lys TyrLys Val Val Glu Ile Glu Pro Leu Gly Val Ala Pro Thr 465 470 475 480 LysSer Lys Arg Arg Val Val Glu Arg Glu Lys Arg Ala Val Gly Leu 485 490 495Gly Ala Val Leu Leu Gly Phe Leu Gly Ala Ala Gly Ser Thr Met Gly 500 505510 Ala Ala Ser Ile Thr Leu Thr Val Gln Ala Arg Gln Leu Leu Ser Gly 515520 525 Ile Val Gln Gln Gln Ser Asn Leu Leu Arg Ala Ile Glu Ala Gln 
Gln530 535 540 His Met Leu Gln Leu Thr Val Trp Gly Ile Lys Gln Leu Gln ThrArg 545 550 555 560 Val Leu Ala Ile Glu Arg Tyr Leu Lys Asp Gln Gln LeuLeu Gly Leu 565 570 575 Trp Gly Cys Ser Gly Lys Ile Ile Cys Thr Thr AlaVal Pro Trp Asn 580 585 590 Ser Ser Trp Ser Asn Lys Ser Gln Glu Asp IleTrp Asp Asn Met Thr 595 600 605 Trp Met Gln Trp Asp Arg Glu Ile Ser AsnTyr Thr Gly Thr Ile Tyr 610 615 620 Arg Leu Leu Glu Asp Ser Gln Asn GlnGln Glu Lys Asn Glu Lys Asp 625 630 635 640 Leu Leu Ala Leu Asp Ser TrpLys Asn Leu Trp Asn Trp Phe Asn Ile 645 650 655 Thr Asn Trp Leu Trp TyrIle Lys Ile Phe Ile Met Ile Val Gly Gly 660 665 670 Leu Ile Gly Leu ArgIle Ile Phe Gly Val Leu Ala Ile Val Lys Arg 675 680 685 Val Arg Gln GlyTyr Ser Pro Leu Ser Phe Gln Thr Leu Thr Pro Ser 690 695 700 Pro Arg GlyPro Asp Arg Leu Gly Arg Ile Glu Glu Glu Gly Gly Glu 705 710 715 720 GlnAsp Lys Asp Arg Ser Ile Arg Leu Val Ser Gly Phe Leu Ala Leu 725 730 735Ala Trp Asp Asp Leu Arg Ser Leu Cys Leu Phe Ser Tyr His His Leu 740 745750 Arg Asp Phe Ile Leu Ile Ala Ala Arg Ala Ala Glu Leu Leu Gly Arg 755760 765 Ser Ser Leu Arg Gly Leu Gln Arg Gly Trp Glu Ala Leu Lys Tyr Leu770 775 780 Gly Asn Leu Val Gln Tyr Gly Gly Leu Glu Leu Lys Arg Ser AlaIle 785 790 795 800 Lys Leu Phe Asp Thr Ile Ala Ile Ala Val Ala Glu GlyThr Asp Arg 805 810 815 Ile Leu Glu Val Ile Gln Arg Ile Cys Arg Ala IleArg His Ile Pro 820 825 830 Ile Arg Ile Arg Gln Gly Phe Glu Ala Ala LeuGln 835 840 7 1905 DNA Human immunodeficiency virus type 1 7 gggggatgtgctgcaaggcg attaagttgg gtaacgccag ggttttccca atcacgacgt 60 tgtaaaacgacagccaatga attgaagctt atggctgctc gcgcatctat cctcagaggc 120 gaaaagttggataagtggga aaaaatcaga ctcaggccag gaggtaaaaa acactacatg 180 ctgaagcatatcgtgtgggc atctagggag ttggagagat ttgcactgaa ccccggactg 240 ctggaaacctcagagggctg taagcaaatc atgaaacagc tccaaccagc cttgcagacc 300 ggaacagaagagctgaagtc cctttacaat accgtggcaa ccctctattg cgtccacgag 360 aagatcgaggtgagagacac aaaggaggcc ctggacaaaa tcgaggagga gcagaataag 420 tgccagcagaagacccagca ggcaaaggct 
gctgacggaa aggtctctca gaactatcct 480 atcgttcagaaccttcaggg gcagatggtg caccaagcaa tcagccctag aaccctgaac 540 gcatgggtgaaggtgatcga ggagaaagcc ttttctcccg aggttatccc catgtttacc 600 gccctgagcgaaggcgccac tcctcaagac ctgaacacta tgctgaacac agtgggagga 660 caccaggccgctatgcagat gttgaaggat accatcaacg aggaggcagc cgaatgggac 720 cgcctccaccccgtgcacgc cggacctatc gcccccggac aaatgagaga acctcgcgga 780 agtgatattgccggtactac cagcaccctt caagagcaga ttgcttggat gaccagcaac 840 ccacccatcccagtgggcga tatttacaaa aggtggatta ttctggggct gaacaaaatt 900 gtgagaatgtactcccccgt ctccatcctc gacatccgcc aaggacccaa ggagcctttt 960 agggattacgtggacagatt cttcaaaacc cttagagctg agcaagccac tcaggaggtt 1020 aagaactggatgacagatac tctgctcgtg caaaacgcta accccgattg caaaaccatc 1080 ttgagagctctcggtccagg tgccaccctt gaggaaatga tgacagcatg tcaaggcgtg 1140 ggaggacctgggcacaaggc cagagttctc gctgaggcca tgagccagac aaactcaggc 1200 aatatcatgatgcagaggag taactttaag ggtcccagga gaatcgtcaa gtgcttcaat 1260 tgtggcaaggagggtcacat tgccaggaac tgccgcgccc ccaggaagaa aggctgctgg 1320 aagtgtggcaaagagggcca ccagatgaag gattgcaccg agcgccaagc aaacttcctg 1380 ggaaagatttggcccagtca taagggccgc cctggcaact tccttcaaaa cagacccgag 1440 cctaccgccccccccgctga gtctttcaga tttgaggaga ccacccccgc tccaaagcag 1500 gagccaattgagagagagcc tctcaccagt ctcaaaagcc tctttggtag cgaccccctc 1560 agccaataagaattctagct tggcgtaatc atggtcatag ctgtttcctg tgtgaaattg 1620 ttatcagctcacaattccac acaacatacg agccggaagc ataaagtgta aagcctggga 1680 tgcctaatgagtgagctaac tcacattagt tgcgttgcgc tcactgcccg ctttccagtc 1740 gggaaacctgtcgtgccagc tccattagtg aatcgtccaa cgcacgggga gaggcggttt 1800 gcgtattgggcgcacttccg cttcctcgct cactgactcg ctgcgctcgt tcgttcggct 1860 gcggcgagccgtatcagctc actcaaaggc ggtaatacgg ttatc 1905 8 631 PRT Humanimmunodeficiency virus type 1 8 Gly Gly Cys Ala Ala Arg Arg Leu Ser TrpVal Thr Pro Gly Phe Ser 1 5 10 15 Gln Ser Arg Arg Cys Lys Thr Thr AlaAsn Glu Leu Lys Leu Met Ala 20 25 30 Ala Arg Ala Ser Ile Leu Arg Gly GluLys Leu Asp Lys Trp Glu Lys 35 40 45 Ile Arg Leu Arg Pro Gly Gly Lys LysHis Tyr 
Met Leu Lys His Ile 50 55 60 Val Trp Ala Ser Arg Glu Leu Glu ArgPhe Ala Leu Asn Pro Gly Leu 65 70 75 80 Leu Glu Thr Ser Glu Gly Cys LysGln Ile Met Lys Gln Leu Gln Pro 85 90 95 Ala Leu Gln Thr Gly Thr Glu GluLeu Lys Ser Leu Tyr Asn Thr Val 100 105 110 Ala Thr Leu Tyr Cys Val HisGlu Lys Ile Glu Val Arg Asp Thr Lys 115 120 125 Glu Ala Leu Asp Lys IleGlu Glu Glu Gln Asn Lys Cys Gln Gln Lys 130 135 140 Thr Gln Gln Ala LysAla Ala Asp Gly Lys Val Ser Gln Asn Tyr Pro 145 150 155 160 Ile Val GlnAsn Leu Gln Gly Gln Met Val His Gln Ala Ile Ser Pro 165 170 175 Arg ThrLeu Asn Ala Trp Val Lys Val Ile Glu Glu Lys Ala Phe Ser 180 185 190 ProGlu Val Ile Pro Met Phe Thr Ala Leu Ser Glu Gly Ala Thr Pro 195 200 205Gln Asp Leu Asn Thr Met Leu Asn Thr Val Gly Gly His Gln Ala Ala 210 215220 Met Gln Met Leu Lys Asp Thr Ile Asn Glu Glu Ala Ala Glu Trp Asp 225230 235 240 Arg Leu His Pro Val His Ala Gly Pro Ile Ala Pro Gly Gln MetArg 245 250 255 Glu Pro Arg Gly Ser Asp Ile Ala Gly Thr Thr Ser Thr LeuGln Glu 260 265 270 Gln Ile Ala Trp Met Thr Ser Asn Pro Pro Ile Pro ValGly Asp Ile 275 280 285 Tyr Lys Arg Trp Ile Ile Leu Gly Leu Asn Lys IleVal Arg Met Tyr 290 295 300 Ser Pro Val Ser Ile Leu Asp Ile Arg Gln GlyPro Lys Glu Pro Phe 305 310 315 320 Arg Asp Tyr Val Asp Arg Phe Phe LysThr Leu Arg Ala Glu Gln Ala 325 330 335 Thr Gln Glu Val Lys Asn Trp MetThr Asp Thr Leu Leu Val Gln Asn 340 345 350 Ala Asn Pro Asp Cys Lys ThrIle Leu Arg Ala Leu Gly Pro Gly Ala 355 360 365 Thr Leu Glu Glu Met MetThr Ala Cys Gln Gly Val Gly Gly Pro Gly 370 375 380 His Lys Ala Arg ValLeu Ala Glu Ala Met Ser Gln Thr Asn Ser Gly 385 390 395 400 Asn Ile MetMet Gln Arg Ser Asn Phe Lys Gly Pro Arg Arg Ile Val 405 410 415 Lys CysPhe Asn Cys Gly Lys Glu Gly His Ile Ala Arg Asn Cys Arg 420 425 430 AlaPro Arg Lys Lys Gly Cys Trp Lys Cys Gly Lys Glu Gly His Gln 435 440 445Met Lys Asp Cys Thr Glu Arg Gln Ala Asn Phe Leu Gly Lys Ile Trp 450 455460 Pro Ser His Lys Gly Arg Pro Gly Asn Phe Leu Gln Asn Arg Pro Glu 465470 475 480 Pro 
Thr Ala Pro Pro Ala Glu Ser Phe Arg Phe Glu Glu Thr ThrPro 485 490 495 Ala Pro Lys Gln Glu Pro Ile Glu Arg Glu Pro Leu Thr SerLeu Lys 500 505 510 Ser Leu Phe Gly Ser Asp Pro Leu Ser Gln Glu Phe LeuGly Val Ile 515 520 525 Met Val Ile Ala Val Ser Cys Val Lys Leu Leu SerAla His Asn Ser 530 535 540 Thr Gln His Thr Ser Arg Lys His Lys Val SerLeu Gly Cys Leu Met 545 550 555 560 Ser Glu Leu Thr His Ile Ser Cys ValAla Leu Thr Ala Arg Phe Pro 565 570 575 Val Gly Lys Pro Val Val Pro AlaPro Leu Val Asn Arg Pro Thr His 580 585 590 Gly Glu Arg Arg Phe Ala TyrTrp Ala His Phe Arg Phe Leu Ala His 595 600 605 Leu Ala Ala Leu Val ArgSer Ala Ala Ala Ser Arg Ile Ser Ser Leu 610 615 620 Lys Gly Gly Asn ThrVal Ile 625 630 9 2577 DNA Human immunodeficiency virus type 1 9tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcaggcgcc 240attcgccatt caggctgcgc aactgttggg aagggcgatc ggtgcgggcc tcttcgctat 300tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt aagttgggta acgccagggt 360tttcccagtc acgacgttgt aaaacgacgg ccagtgccaa gcttgcatgc ctgcaggtcg 420actctagagg atccccgggt accgagctcc ttcccacaag ggccggccag gcaatttcct 480tcagaacaga ccagagccaa cagccccacc agcagagagc ttcaggttcg aagagacaac 540ccccgctccg aaacaggagc cgagagaaag ggaaccctta acttccctca aatcactctt 600tggcagcgac cccttgtctc aataaaaatc ggcggccaga cccgggaggc cctgctggac 660accggcgccg acgacaccgt gctggaggac atcaacctgc ccggcaagtg gaagcccaag 720atgatcggcg gcatcggcgg cttcatcaag gtgcggcagt acgaccagat cctgatcgag 780atctgcggca agaaggccat cggcaccgtg ctggtgggcc ccacccccgt gaacatcatc 840ggccggaaca tgctgaccca gctgggctgc accctgaact tccccatcag ccccatcgag 900accgtgcccg tgaagctgaa gcccggcatg gacggcccca aggtgaagca gtggcccctg 960accgaggtga agatcaaggc cctgaccgcc atctgcgagg agatggagaa ggagggcaag 1020atcaccaaga tcggccccga gaacccctac aacaccccca tcttcgccat caagaaggag 
1080gacagcacca agtggcggaa gctggtggac ttccgggagc tgaacaagcg gacccaggac 1140ttctgggagg tgcagctggg catcccccac cccgccggcc tgaagaagaa gaagagcgtg 1200accgtgctgg acgtgggcga cgcctacttc agcgtgcccc tggacgaggg cttccggaag 1260tacaccgcct tcaccatccc cagcatcaac aacgagaccc ccggcatccg gtaccagtac 1320aacgtgctgc cccagggctg gaagggcagc cccgccatct tccaggccag catgaccaag 1380atcctggagc ccttccgggc caagaacccc gagatcgtga tctaccagta catggccgcc 1440ctgtacgtgg gcagcgacct ggagatcggc cagcaccggg ccaagatcga ggagctgcgg 1500gagcacctgc tgaagtgggg cttcaccacc cccgacaaga agcaccagaa ggagcccccc 1560ttcctgtgga tgggctacga gctgcacccc gacaagtgga ccgtgcagcc catccagctg 1620cccgagaagg acagctggac cgtgaacgac atccagaagc tggtgggcaa gctgaactgg 1680accagccaga tctaccccgg catcaaggtg cggcagctgt gcaagctgct gcggggcacc 1740aaggccctga ccgacatcgt gcccctgacc gaggaggccg agctggagct ggccgagaac 1800cgggagatcc tgaaggagcc cgtgcacggc gtgtactacg accccagcaa ggacctgatc 1860gccgagatcc agaagcaggg cgacgaccag tggacctacc agatctacca ggagcccttc 1920aagaacctga aaaccggcaa gtacgccaag cggcggacca cccacaccaa cgacgtgaag 1980cagctgaccg aggccgtgca gaagatcagc ctggagagca tcgtgacctg gggcaagacc 2040cccaagttcc ggctgcccat ccagaaggag acctgggaga tctggtggac cgactactgg 2100caggccacct ggatccccga gtgggagttc gtgaacaccc cccccctggt gaagctgtgg 2160taccagctgg agaaggagcc catcgccggc gccgagacct tctacgtgga cggcgccgcc 2220aaccgggaga ccaagatcgg caaggccggc tacgtgaccg accggggccg gcagaagatc 2280gtgaccctga gcgagaccac caaccagaaa accgagctgc aggccatcca gctggccctg 2340caggacagcg agagcgaggt gaacatcgtg accgacagcc agtacgccct gggcatcatc 2400caggcccagc ccgaccggag cgagagcgag ctggtgaacc agatcatcga gcagctgatc 2460aagaaggagc gggcctacct gagctgggtg cccgcccaca agggcatcgg cggcgacgag 2520caggtggaca agctggtgag cagcggcatc cggaaggtgc tgtgatctag agaattc 2577 10850 PRT Human immunodeficiency virus type 1 10 Ser Arg Val Ser Val MetThr Val Lys Thr Ser Asp Thr Cys Ser Ser 1 5 10 15 Arg Arg Arg Ser GlnLeu Val Cys Lys Arg Met Pro Gly Ala Asp Lys 20 25 30 Pro Val Arg Ala ArgGln Arg Val Leu Ala Gly Val Gly 
Ala Gly Leu 35 40 45 Thr Met Arg His GlnSer Arg Leu Tyr Glu Cys Thr Ile Cys Gly Val 50 55 60 Lys Tyr Arg Thr AspAla Gly Glu Asn Thr Ala Ser Gly Ala Ile Arg 65 70 75 80 His Ser Gly CysAla Thr Val Gly Lys Gly Asp Arg Cys Gly Pro Leu 85 90 95 Arg Tyr Tyr AlaSer Trp Arg Lys Gly Asp Val Leu Gln Gly Asp Val 100 105 110 Gly Arg GlnGly Phe Pro Ser His Asp Val Val Lys Arg Arg Pro Val 115 120 125 Pro SerLeu His Ala Cys Arg Ser Thr Leu Glu Asp Pro Arg Val Pro 130 135 140 SerSer Phe Pro Gln Gly Pro Ala Arg Gln Phe Pro Ser Glu Gln Thr 145 150 155160 Arg Ala Asn Ser Pro Thr Ser Arg Glu Leu Gln Val Arg Arg Asp Asn 165170 175 Pro Arg Ser Glu Thr Gly Ala Glu Arg Lys Gly Thr Leu Asn Phe Pro180 185 190 Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Ser Ile Lys Ile GlyGly 195 200 205 Gln Thr Arg Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp ThrVal Leu 210 215 220 Glu Asp Ile Asn Leu Pro Gly Lys Trp Lys Pro Lys MetIle Gly Gly 225 230 235 240 Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr AspGln Ile Leu Ile Glu 245 250 255 Ile Cys Gly Lys Lys Ala Ile Gly Thr ValLeu Val Gly Pro Thr Pro 260 265 270 Val Asn Ile Ile Gly Arg Asn Met LeuThr Gln Leu Gly Cys Thr Leu 275 280 285 Asn Phe Pro Ile Ser Pro Ile GluThr Val Pro Val Lys Leu Lys Pro 290 295 300 Gly Met Asp Gly Pro Lys ValLys Gln Trp Pro Leu Thr Glu Val Lys 305 310 315 320 Ile Lys Ala Leu ThrAla Ile Cys Glu Glu Met Glu Lys Glu Gly Lys 325 330 335 Ile Thr Lys IleGly Pro Glu Asn Pro Tyr Asn Thr Pro Ile Phe Ala 340 345 350 Ile Lys LysGlu Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe Arg 355 360 365 Glu LeuAsn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly Ile 370 375 380 ProHis Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu Asp 385 390 395400 Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Gly Phe Arg Lys 405410 415 Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly Ile420 425 430 Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser ProAla 435 440 445 Ile Phe Gln Ala Ser Met Thr Lys Ile Leu Glu Pro Phe ArgAla Lys 450 455 460 Asn Pro Glu Ile Val 
Ile Tyr Gln Tyr Met Ala Ala LeuTyr Val Gly 465 470 475 480 Ser Asp Leu Glu Ile Gly Gln His Arg Ala LysIle Glu Glu Leu Arg 485 490 495 Glu His Leu Leu Lys Trp Gly Phe Thr ThrPro Asp Lys Lys His Gln 500 505 510 Lys Glu Pro Pro Phe Leu Trp Met GlyTyr Glu Leu His Pro Asp Lys 515 520 525 Trp Thr Val Gln Pro Ile Gln LeuPro Glu Lys Asp Ser Trp Thr Val 530 535 540 Asn Asp Ile Gln Lys Leu ValGly Lys Leu Asn Trp Thr Ser Gln Ile 545 550 555 560 Tyr Pro Gly Ile LysVal Arg Gln Leu Cys Lys Leu Leu Arg Gly Thr 565 570 575 Lys Ala Leu ThrAsp Ile Val Pro Leu Thr Glu Glu Ala Glu Leu Glu 580 585 590 Leu Ala GluAsn Arg Glu Ile Leu Lys Glu Pro Val His Gly Val Tyr 595 600 605 Tyr AspPro Ser Lys Asp Leu Ile Ala Glu Ile Gln Lys Gln Gly Asp 610 615 620 AspGln Trp Thr Tyr Gln Ile Tyr Gln Glu Pro Phe Lys Asn Leu Lys 625 630 635640 Thr Gly Lys Tyr Ala Lys Arg Arg Thr Thr His Thr Asn Asp Val Lys 645650 655 Gln Leu Thr Glu Ala Val Gln Lys Ile Ser Leu Glu Ser Ile Val Thr660 665 670 Trp Gly Lys Thr Pro Lys Phe Arg Leu Pro Ile Gln Lys Glu ThrTrp 675 680 685 Glu Ile Trp Trp Thr Asp Tyr Trp Gln Ala Thr Trp Ile ProGlu Trp 690 695 700 Glu Phe Val Asn Thr Pro Pro Leu Val Lys Leu Trp TyrGln Leu Glu 705 710 715 720 Lys Glu Pro Ile Ala Gly Ala Glu Thr Phe TyrVal Asp Gly Ala Ala 725 730 735 Asn Arg Glu Thr Lys Ile Gly Lys Ala GlyTyr Val Thr Asp Arg Gly 740 745 750 Arg Gln Lys Ile Val Thr Leu Ser GluThr Thr Asn Gln Lys Thr Glu 755 760 765 Leu Gln Ala Ile Gln Leu Ala LeuGln Asp Ser Glu Ser Glu Val Asn 770 775 780 Ile Val Thr Asp Ser Gln TyrAla Leu Gly Ile Ile Gln Ala Gln Pro 785 790 795 800 Asp Arg Ser Glu SerGlu Leu Val Asn Gln Ile Ile Glu Gln Leu Ile 805 810 815 Lys Lys Glu ArgAla Tyr Leu Ser Trp Val Pro Ala His Lys Gly Ile 820 825 830 Gly Gly AspGlu Gln Val Asp Lys Leu Val Ser Ser Gly Ile Arg Lys 835 840 845 Val Leu850 11 2564 DNA Human immunodeficiency virus type 1 11 aagcttatgagggttatggg gattcagaga aactggcctc agtggtggat ttgggggaca 60 ttgggattttggatgatcat catctgtcgc gtcgtgggca acctgaacct gtgggtcact 120 
gtctactatggagtgccagt ttggaaggaa gccaagacaa ctctgttttg cgccagcgac 180 gccaaggcttatgacaagga agtccacaac gtgtgggcca cccacgcatg tgtcccaacc 240 gaccccaacccacgcgaaat cgtgctggaa aacgtcacag aaaatttcaa catgtggaaa 300 aacgatatggtggatcagat gcatgaggat attattagcc tctgggacca gtctctgaag 360 ccatgtgtgaagttgacacc tctctgtgtg acccttaact gtactaacgc ccccgcctat 420 aacaactctatgcacgggga gatgaaaaac tgttccttca acaccaccac cgaaatcagg 480 gacagaaaacagaaagccta tgccctgttc tataagcccg atgtggtgcc acttaaccgc 540 cgcgaagaaaataatggtac tggcgaatat attctgatta actgtaacag ctctacaatt 600 actcaggcttgccctaaagt cacctttgac ccaatcccaa tccactactg cgcccctgca 660 ggatacgctatcctgaaatg caataataag accttcaacg gaactggacc ctgcaataac 720 gtgtctacagtgcaatgtac ccacggcatt atgcccgtcg tctccaccca actgctgctc 780 aatggcagcttggcagaaga ggagatcatt attaggagcg aaaacctcac caacaatatc 840 aagacaatcatcgtgcacct gaacaagtct gtggaaattg tgtgtaccag gcccaataac 900 aacaccaggaagagcatccg catcggacct ggacaaactt tctacgccac cggcgaaatc 960 atcgggaacattagagaagc ccactgcaac atctctaaga gcaattggac atctacattg 1020 gagcaagtgaaaaaaaagct gaaagagcac tacaataaga ccatcgagtt caaccctcct 1080 tccggcggcgatctggaggt cacaacacac tcctttaact gtagggggga gttcttttac 1140 tgcaacacaacaaagctgtt tagcaacaac tccgacagca ataatgagac tatcaccctg 1200 ccttgcaagatcaagcaaat cattaacatg tggcagaaag tgggaagggc aatgtatgca 1260 cctcccatcgagggcaacat cacatgcaag tctaatatca ccggcctgtt gctgactaga 1320 gacggtggcaagaatactac taacgaaatc ttcaggccag gtggagggaa catgaaagat 1380 aattggcgctccgaactgta taagtacaag gtggtggaga ttgagcccct cggcgtcgcc 1440 cccacaaagtctaagcgccg cgtggtggaa agagagaaga gggctgtcgg cctcggcgca 1500 gtgctgctggggttcttggg tgccgctggg tctacaatgg gcgctgcctc tattacactc 1560 accgtgcaagctaggcagct gctgtccggt attgtgcaac aacagagcaa tctcttgaga 1620 gctatcgaggcccagcagca tatgctgcaa cttacagtgt ggggtattaa gcagctgcaa 1680 actcgcgtcctggcaatcga acgctacctg aaagaccagc aactcctggg tctgtggggc 1740 tgctccggtaagatcatctg taccacagcc gtgccctgga acagcagctg gtccaataag 1800 agccaagaggatatttggga taatatgacc tggatgcaat gggatagaga 
gatcagcaac 1860 tacacaggaaccatttatag gctcctggaa gattctcaga accagcagga gaagaacgag 1920 aaggacttgctcgccctgga tagctggaaa aacctgtgga attggtttaa catcaccaac 1980 tggctttggtacattaagat tttcatcatg attgtgggag gcttgatcgg cctgaggatt 2040 atcttcggggtgcttgccat tgtgaaaagg gtcagacaag gatactcccc attgtccttt 2100 cagaccttgactccaagccc acgcggaccc gacaggttgg gcaggatcga ggaggaagga 2160 ggcgaacaggataaggaccg ctccatcaga cttgttagcg ggtttctggc cctggcctgg 2220 gatgatctgaggagcctgtg cctcttctcc tatcaccacc tccgcgattt catcctcatt 2280 gcagctagggctgctgagtt gctgggacgc tcctccctga gaggtctcca gagaggctgg 2340 gaggcactgaagtacctcgg gaaccttgtg caatacggcg ggctggagct gaaaagatcc 2400 gccatcaagctgttcgacac catcgcaatc gccgttgcag agggcaccga caggatcttg 2460 gaggtcattcagaggatctg tcgcgccatc cgccacatcc ccatcaggat cagacaagga 2520 ttcgaggcagcactgcaatg atagttaatt aaacgcgtgg atcc 2564 12 852 PRT Humanimmunodeficiency virus type 1 12 Lys Leu Met Arg Val Met Gly Ile Gln ArgAsn Trp Pro Gln Trp Trp 1 5 10 15 Ile Trp Gly Thr Leu Gly Phe Trp MetIle Ile Ile Cys Arg Val Val 20 25 30 Gly Asn Leu Asn Leu Trp Val Thr ValTyr Tyr Gly Val Pro Val Trp 35 40 45 Lys Glu Ala Lys Thr Thr Leu Phe CysAla Ser Asp Ala Lys Ala Tyr 50 55 60 Asp Lys Glu Val His Asn Val Trp AlaThr His Ala Cys Val Pro Thr 65 70 75 80 Asp Pro Asn Pro Arg Glu Ile ValLeu Glu Asn Val Thr Glu Asn Phe 85 90 95 Asn Met Trp Lys Asn Asp Met ValAsp Gln Met His Glu Asp Ile Ile 100 105 110 Ser Leu Trp Asp Gln Ser LeuLys Pro Cys Val Lys Leu Thr Pro Leu 115 120 125 Cys Val Thr Leu Asn CysThr Asn Ala Pro Ala Tyr Asn Asn Ser Met 130 135 140 His Gly Glu Met LysAsn Cys Ser Phe Asn Thr Thr Thr Glu Ile Arg 145 150 155 160 Asp Arg LysGln Lys Ala Tyr Ala Leu Phe Tyr Lys Pro Asp Val Val 165 170 175 Pro LeuAsn Arg Arg Glu Glu Asn Asn Gly Thr Gly Glu Tyr Ile Leu 180 185 190 IleAsn Cys Asn Ser Ser Thr Ile Thr Gln Ala Cys Pro Lys Val Thr 195 200 205Phe Asp Pro Ile Pro Ile His Tyr Cys Ala Pro Ala Gly Tyr Ala Ile 210 215220 Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys Asn Asn 225230 
235 240 Val Ser Thr Val Gln Cys Thr His Gly Ile Met Pro Val Val SerThr 245 250 255 Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Ile IleIle Arg 260 265 270 Ser Glu Asn Leu Thr Asn Asn Ile Lys Thr Ile Ile ValHis Leu Asn 275 280 285 Lys Ser Val Glu Ile Val Cys Thr Arg Pro Asn AsnAsn Thr Arg Lys 290 295 300 Ser Ile Arg Ile Gly Pro Gly Gln Thr Phe TyrAla Thr Gly Glu Ile 305 310 315 320 Ile Gly Asn Ile Arg Glu Ala His CysAsn Ile Ser Lys Ser Asn Trp 325 330 335 Thr Ser Thr Leu Glu Gln Val LysLys Lys Leu Lys Glu His Tyr Asn 340 345 350 Lys Thr Ile Glu Phe Asn ProPro Ser Gly Gly Asp Leu Glu Val Thr 355 360 365 Thr His Ser Phe Asn CysArg Gly Glu Phe Phe Tyr Cys Asn Thr Thr 370 375 380 Lys Leu Phe Ser AsnAsn Ser Asp Ser Asn Asn Glu Thr Ile Thr Leu 385 390 395 400 Pro Cys LysIle Lys Gln Ile Ile Asn Met Trp Gln Lys Val Gly Arg 405 410 415 Ala MetTyr Ala Pro Pro Ile Glu Gly Asn Ile Thr Cys Lys Ser Asn 420 425 430 IleThr Gly Leu Leu Leu Thr Arg Asp Gly Gly Lys Asn Thr Thr Asn 435 440 445Glu Ile Phe Arg Pro Gly Gly Gly Asn Met Lys Asp Asn Trp Arg Ser 450 455460 Glu Leu Tyr Lys Tyr Lys Val Val Glu Ile Glu Pro Leu Gly Val Ala 465470 475 480 Pro Thr Lys Ser Lys Arg Arg Val Val Glu Arg Glu Lys Arg AlaVal 485 490 495 Gly Leu Gly Ala Val Leu Leu Gly Phe Leu Gly Ala Ala GlySer Thr 500 505 510 Met Gly Ala Ala Ser Ile Thr Leu Thr Val Gln Ala ArgGln Leu Leu 515 520 525 Ser Gly Ile Val Gln Gln Gln Ser Asn Leu Leu ArgAla Ile Glu Ala 530 535 540 Gln Gln His Met Leu Gln Leu Thr Val Trp GlyIle Lys Gln Leu Gln 545 550 555 560 Thr Arg Val Leu Ala Ile Glu Arg TyrLeu Lys Asp Gln Gln Leu Leu 565 570 575 Gly Leu Trp Gly Cys Ser Gly LysIle Ile Cys Thr Thr Ala Val Pro 580 585 590 Trp Asn Ser Ser Trp Ser AsnLys Ser Gln Glu Asp Ile Trp Asp Asn 595 600 605 Met Thr Trp Met Gln TrpAsp Arg Glu Ile Ser Asn Tyr Thr Gly Thr 610 615 620 Ile Tyr Arg Leu LeuGlu Asp Ser Gln Asn Gln Gln Glu Lys Asn Glu 625 630 635 640 Lys Asp LeuLeu Ala Leu Asp Ser Trp Lys Asn Leu Trp Asn Trp Phe 645 650 655 Asn IleThr Asn Trp Leu 
Trp Tyr Ile Lys Ile Phe Ile Met Ile Val 660 665 670 GlyGly Leu Ile Gly Leu Arg Ile Ile Phe Gly Val Leu Ala Ile Val 675 680 685Lys Arg Val Arg Gln Gly Tyr Ser Pro Leu Ser Phe Gln Thr Leu Thr 690 695700 Pro Ser Pro Arg Gly Pro Asp Arg Leu Gly Arg Ile Glu Glu Glu Gly 705710 715 720 Gly Glu Gln Asp Lys Asp Arg Ser Ile Arg Leu Val Ser Gly PheLeu 725 730 735 Ala Leu Ala Trp Asp Asp Leu Arg Ser Leu Cys Leu Phe SerTyr His 740 745 750 His Leu Arg Asp Phe Ile Leu Ile Ala Ala Arg Ala AlaGlu Leu Leu 755 760 765 Gly Arg Ser Ser Leu Arg Gly Leu Gln Arg Gly TrpGlu Ala Leu Lys 770 775 780 Tyr Leu Gly Asn Leu Val Gln Tyr Gly Gly LeuGlu Leu Lys Arg Ser 785 790 795 800 Ala Ile Lys Leu Phe Asp Thr Ile AlaIle Ala Val Ala Glu Gly Thr 805 810 815 Asp Arg Ile Leu Glu Val Ile GlnArg Ile Cys Arg Ala Ile Arg His 820 825 830 Ile Pro Ile Arg Ile Arg GlnGly Phe Glu Ala Ala Leu Gln Leu Ile 835 840 845 Lys Arg Val Asp 850 132579 DNA Human immunodeficiency virus type 1 13 aggctaattt tttagggaaaatttggcctt cccacaaggg gaggccaggg aatttccttc 60 agagcaggcc aatgagagtgagggggatac agaggaattg gccacaatgg tggatatggg 120 gcatcttagg cttttggatgttaatgattt gtagtggggt gggaaacttg tgggtcacaa 180 tctattatgg ggtacctgtgtggagagaag caaaaactac tctattctgt gcatcagatg 240 ctaaagcata tgatagagaagtgcataatg tctgggctac acatgcctgt gtacccacag 300 accccaaccc acaagaaatagttatgggaa atgtaacaga aaattttaac atgtggaaaa 360 atgacatggt ggatcagatgcatgaggata taatcaattt atgggatcaa agcctaaagc 420 catgtgtaaa gttaaccccactctgtgtca ctttaaaatg tagtacctat aatggtagtg 480 ataccaacga tatgagaaattgctctttca atacaactac agaaataagg gacaagaaac 540 agacagtgta tgcacttttttataaacctg atatagtacc aattaatgag agtgagtata 600 tattaataca ttgcaatacctcaaccataa cacaagcctg tccaaaggtc tcttttgacc 660 caattcctat acattattgtgctccagctg gttatgcgat tctaaagtgt aataataaga 720 cattcaatgg gacgggaccatgccaaaatg tcagcacagt acaatgcaca catggaatta 780 agccagtagt atcaactcaactactgttaa atggtagcat agcagaagga gagataataa 840 ttagatctga aaatctgacaaacaatgtta aaacaataat agtacacctt aatgaatcta 900 taggaattgt 
gtgtacaagacccggcaata atacaagaaa aagtataagg ataggaccag 960 gacaagcatt ctatacaaatcacataatag gagatataag acaagcatat tgtaacatta 1020 gtaaacaaga atggaacaaaactttagaag aggtgagaaa aaaattgcaa gaacacttcc 1080 caaataaaac aataaaatttaactcatcct caggagggga cctagaaatt acaacacata 1140 gctttaattg cagaggagaatttttctatt gcaatacatc aaaactattt aatgatagtc 1200 tagtaaatga tacagaaagtaattcaacca tcactattcc atgcagaata aaacaaatta 1260 taaacatgtg gcaggaggtaggacgagcaa tgtatgcccc tcccattgca ggaaacataa 1320 catgtaaatc aaatatcacaggactactat tgacacgtga tggaggaaca gataacacaa 1380 cagagatatt cagacctggaggaggaaata tgaaggacaa ttggagaagt gaattatata 1440 aatataaagt agtagaaattaagccattgg gaatagcacc cactgaagca aaaaggagag 1500 tggtggagag agaaaaaagagcagtgggaa taggagctgt gctccttggg ttcttgggag 1560 cagcaggaag cactatgggcgcggcgtcaa taacgctgac ggtacaggcc agacaactgt 1620 tgtctggtat agtgcaacagcaaagcaatt tgctgagagc tatagaggcg caacagcata 1680 tgttgcaact cacagtctggggcattaagc agctccagac aagagtcctg gctatagaaa 1740 gatacctaaa ggatcaacagctcctaggac tttggggctg ctctggaaaa ctcatctgca 1800 ccactaatgt gccttggaactccagttgga gcaataaatc tcaacaagct atttgggata 1860 acatgacatg gatgcagtgggatagagaaa ttaataatta cacaaacata atataccagt 1920 tgcttgagga ctcgcaaatccagcaggaac agaatgaaaa agatttatta gcattggaca 1980 agtggcaaaa tctgtggagttggtttagca taacaaattg gctatggtat ataaaaatat 2040 tcataatgat agtaggaggcttaataggtt taagaataat ttttgctgtg ctatctatag 2100 taaatagagt taggcagggatactcacctt tgtcgtttca gacccttacc ccaaacccga 2160 ggggacccga caggctcggagaaatcgaag aagaaggtgg agagcaagac agagacagat 2220 ccgttcgatt agtgagcggattcttaccac ttgcctggga cgatctgcgg agcctgtgcc 2280 tcttcagcta ccaccgattgagagacttca tattcgattg cagcgaggac agtggaactt 2340 ctgggacgca gcagtctcaggggactccag aggggtggga agtccttaaa tatctgggaa 2400 gccttgtgca gtattggggtctggagctaa aaagagtgct attagtctgc ttgataccca 2460 tagcaatagc agtagctgaaggaacagata ggattattga attagtacta agattttgta 2520 gagctatccg caacatacctacaagagtaa gacagggctg tgaagcagct ttgctataa 2579 14 858 PRT Humanimmunodeficiency virus type 1 14 Ala 
Asn Phe Leu Gly Lys Ile Trp Pro SerHis Lys Gly Arg Pro Gly 1 5 10 15 Asn Phe Leu Gln Ser Arg Pro Met ArgVal Arg Gly Ile Gln Arg Asn 20 25 30 Trp Pro Gln Trp Trp Ile Trp Gly IleLeu Gly Phe Trp Met Leu Met 35 40 45 Ile Cys Ser Gly Val Gly Asn Leu TrpVal Thr Ile Tyr Tyr Gly Val 50 55 60 Pro Val Trp Arg Glu Ala Lys Thr ThrLeu Phe Cys Ala Ser Asp Ala 65 70 75 80 Lys Ala Tyr Asp Arg Glu Val HisAsn Val Trp Ala Thr His Ala Cys 85 90 95 Val Pro Thr Asp Pro Asn Pro GlnGlu Ile Val Met Gly Asn Val Thr 100 105 110 Glu Asn Phe Asn Met Trp LysAsn Asp Met Val Asp Gln Met His Glu 115 120 125 Asp Ile Ile Asn Leu TrpAsp Gln Ser Leu Lys Pro Cys Val Lys Leu 130 135 140 Thr Pro Leu Cys ValThr Leu Lys Cys Ser Thr Tyr Asn Gly Ser Asp 145 150 155 160 Thr Asn AspMet Arg Asn Cys Ser Phe Asn Thr Thr Thr Glu Ile Arg 165 170 175 Asp LysLys Gln Thr Val Tyr Ala Leu Phe Tyr Lys Pro Asp Ile Val 180 185 190 ProIle Asn Glu Ser Glu Tyr Ile Leu Ile His Cys Asn Thr Ser Thr 195 200 205Ile Thr Gln Ala Cys Pro Lys Val Ser Phe Asp Pro Ile Pro Ile His 210 215220 Tyr Cys Ala Pro Ala Gly Tyr Ala Ile Leu Lys Cys Asn Asn Lys Thr 225230 235 240 Phe Asn Gly Thr Gly Pro Cys Gln Asn Val Ser Thr Val Gln CysThr 245 250 255 His Gly Ile Lys Pro Val Val Ser Thr Gln Leu Leu Leu AsnGly Ser 260 265 270 Ile Ala Glu Gly Glu Ile Ile Ile Arg Ser Glu Asn LeuThr Asn Asn 275 280 285 Val Lys Thr Ile Ile Val His Leu Asn Glu Ser IleGly Ile Val Cys 290 295 300 Thr Arg Pro Gly Asn Asn Thr Arg Lys Ser IleArg Ile Gly Pro Gly 305 310 315 320 Gln Ala Phe Tyr Thr Asn His Ile IleGly Asp Ile Arg Gln Ala Tyr 325 330 335 Cys Asn Ile Ser Lys Gln Glu TrpAsn Lys Thr Leu Glu Glu Val Arg 340 345 350 Lys Lys Leu Gln Glu His PhePro Asn Lys Thr Ile Lys Phe Asn Ser 355 360 365 Ser Ser Gly Gly Asp LeuGlu Ile Thr Thr His Ser Phe Asn Cys Arg 370 375 380 Gly Glu Phe Phe TyrCys Asn Thr Ser Lys Leu Phe Asn Asp Ser Leu 385 390 395 400 Val Asn AspThr Glu Ser Asn Ser Thr Ile Thr Ile Pro Cys Arg Ile 405 410 415 Lys GlnIle Ile Asn Met Trp Gln Glu Val Gly Arg Ala 
Met Tyr Ala 420 425 430 ProPro Ile Ala Gly Asn Ile Thr Cys Lys Ser Asn Ile Thr Gly Leu 435 440 445Leu Leu Thr Arg Asp Gly Gly Thr Asp Asn Thr Thr Glu Ile Phe Arg 450 455460 Pro Gly Gly Gly Asn Met Lys Asp Asn Trp Arg Ser Glu Leu Tyr Lys 465470 475 480 Tyr Lys Val Val Glu Ile Lys Pro Leu Gly Ile Ala Pro Thr GluAla 485 490 495 Lys Arg Arg Val Val Glu Arg Glu Lys Arg Ala Val Gly IleGly Ala 500 505 510 Val Leu Leu Gly Phe Leu Gly Ala Ala Gly Ser Thr MetGly Ala Ala 515 520 525 Ser Ile Thr Leu Thr Val Gln Ala Arg Gln Leu LeuSer Gly Ile Val 530 535 540 Gln Gln Gln Ser Asn Leu Leu Arg Ala Ile GluAla Gln Gln His Met 545 550 555 560 Leu Gln Leu Thr Val Trp Gly Ile LysGln Leu Gln Thr Arg Val Leu 565 570 575 Ala Ile Glu Arg Tyr Leu Lys AspGln Gln Leu Leu Gly Leu Trp Gly 580 585 590 Cys Ser Gly Lys Leu Ile CysThr Thr Asn Val Pro Trp Asn Ser Ser 595 600 605 Trp Ser Asn Lys Ser GlnGln Ala Ile Trp Asp Asn Met Thr Trp Met 610 615 620 Gln Trp Asp Arg GluIle Asn Asn Tyr Thr Asn Ile Ile Tyr Gln Leu 625 630 635 640 Leu Glu AspSer Gln Ile Gln Gln Glu Gln Asn Glu Lys Asp Leu Leu 645 650 655 Ala LeuAsp Lys Trp Gln Asn Leu Trp Ser Trp Phe Ser Ile Thr Asn 660 665 670 TrpLeu Trp Tyr Ile Lys Ile Phe Ile Met Ile Val Gly Gly Leu Ile 675 680 685Gly Leu Arg Ile Ile Phe Ala Val Leu Ser Ile Val Asn Arg Val Arg 690 695700 Gln Gly Tyr Ser Pro Leu Ser Phe Gln Thr Leu Thr Pro Asn Pro Arg 705710 715 720 Gly Pro Asp Arg Leu Gly Glu Ile Glu Glu Glu Gly Gly Glu GlnAsp 725 730 735 Arg Asp Arg Ser Val Arg Leu Val Ser Gly Phe Leu Pro LeuAla Trp 740 745 750 Asp Asp Leu Arg Ser Leu Cys Leu Phe Ser Tyr His ArgLeu Arg Asp 755 760 765 Phe Ile Phe Asp Cys Ser Glu Asp Ser Gly Thr SerGly Thr Gln Gln 770 775 780 Ser Gln Gly Thr Pro Glu Gly Trp Glu Val LeuLys Tyr Leu Gly Ser 785 790 795 800 Leu Val Gln Tyr Trp Gly Leu Glu LeuLys Arg Val Leu Leu Val Cys 805 810 815 Leu Ile Pro Ile Ala Ile Ala ValAla Glu Gly Thr Asp Arg Ile Ile 820 825 830 Glu Leu Val Leu Arg Phe CysArg Ala Ile Arg Asn Ile Pro Thr Arg 835 840 845 Val Arg 
Gln Gly Cys GluAla Ala Leu Leu 850 855 15 311 PRT Human immunodeficiency virus type 115 Gly Glu Lys Leu Asp Lys Trp Glu Lys Ile Arg Leu Arg Pro Gly Gly 1 510 15 Lys Lys His Tyr Met Leu Lys His Leu Val Trp Ala Ser Arg Glu Leu 2025 30 Glu Arg Phe Ala Leu Asn Pro Gly Leu Leu Glu Thr Ser Glu Gly Cys 3540 45 Lys Gln Ile Met Lys Gln Leu Gln Pro Ala Leu Gln Thr Gly Thr Glu 5055 60 Glu Leu Arg Ser Leu Tyr Asn Thr Val Ala Thr Leu Tyr Cys Val His 6570 75 80 Glu Lys Ile Glu Val Arg Asp Thr Lys Glu Ala Leu Asp Lys Ile Glu85 90 95 Glu Glu Gln Asn Lys Ser Gln Gln Cys Gln Gln Lys Thr Gln Gln Ala100 105 110 Lys Ala Ala Asp Gly Gly Lys Val Ser Gln Asn Tyr Pro Ile ValGln 115 120 125 Asn Leu Gln Gly Gln Met Val His Gln Ala Ile Ser Pro ArgThr Leu 130 135 140 Asn Ala Trp Val Lys Val Ile Glu Glu Lys Ala Phe SerPro Glu Val 145 150 155 160 Ile Pro Met Phe Thr Ala Leu Ser Glu Gly AlaThr Pro Gln Asp Leu 165 170 175 Asn Thr Met Leu Asn Thr Val Gly Gly HisGln Ala Ala Met Gln Met 180 185 190 Leu Lys Asp Thr Ile Asn Glu Glu AlaAla Glu Trp Asp Arg Leu His 195 200 205 Pro Val His Ala Gly Pro Ile AlaPro Gly Gln Met Arg Glu Pro Arg 210 215 220 Gly Ser Asp Ile Ala Gly ThrThr Ser Thr Leu Gln Glu Gln Ile Ala 225 230 235 240 Trp Met Thr Ser AsnPro Pro Ile Pro Val Gly Asp Ile Tyr Lys Arg 245 250 255 Trp Ile Ile LeuGly Leu Asn Lys Ile Val Arg Met Tyr Ser Pro Val 260 265 270 Ser Ile LeuAsp Ile Lys Gln Gly Pro Lys Glu Pro Phe Arg Asp Tyr 275 280 285 Val AspArg Phe Phe Lys Thr Leu Arg Ala Glu Gln Ala Thr Gln Asp 290 295 300 ValLys Asn Trp Met Thr Asp 305 310 16 277 PRT Human immunodeficiency virustype 1 16 Leu Thr Glu Glu Lys Ile Lys Ala Leu Thr Ala Ile Cys Glu GluMet 1 5 10 15 Glu Lys Glu Gly Lys Ile Thr Lys Ile Gly Pro Glu Asn ProTyr Asn 20 25 30 Thr Pro Val Phe Ala Ile Lys Lys Lys Asp Ser Thr Lys TrpArg Lys 35 40 45 Leu Val Asp Phe Arg Glu Leu Asn Lys Arg Thr Gln Asp PheTrp Glu 50 55 60 Val Gln Leu Gly Ile Pro His Pro Ala Gly Leu Lys Lys LysLys Ser 65 70 75 80 Val Thr Val Leu Asp Val Gly Asp Ala Tyr Phe 
Ser Val Pro Leu Asp 85 90 95 Glu Gly Phe Arg Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn 100 105 110 Glu Thr Pro Gly Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp 115 120 125 Lys Gly Ser Pro Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu 130 135 140 Pro Phe Arg Ala Lys Asn Pro Glu Ile Val Ile Tyr Gln Tyr Met Asp 145 150 155 160 Asp Leu Tyr Val Gly Ser Asp Leu Glu Ile Gly Gln His Arg Ala Lys 165 170 175 Ile Glu Glu Leu Arg Glu His Leu Leu Lys Trp Gly Phe Thr Thr Pro 180 185 190 Asp Lys Lys His Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu 195 200 205 Leu His Pro Asp Lys Trp Thr Val Gln Pro Ile Gln Leu Pro Glu Lys 210 215 220 Asp Ser Trp Thr Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn 225 230 235 240 Trp Ala Ser Gln Ile Tyr Pro Gly Ile Lys Val Arg Gln Leu Cys Lys 245 250 255 Leu Leu Arg Gly Ala Lys Ala Leu Thr Asp Ile Val Pro Leu Thr Glu 260 265 270 Glu Ala Glu Leu Glu 275 17 229 PRT Human immunodeficiency virus type 1 17 Tyr Cys Ala Pro Ala Gly Tyr Ala Ile Leu Lys Cys Asn Asn Lys Thr 1 5 10 15 Phe Asn Gly Thr Gly Pro Cys Asn Asn Val Ser Thr Val Gln Cys Thr 20 25 30 His Gly Ile Lys Pro Val Val Ser Thr Gln Leu Leu Leu Asn Gly Ser 35 40 45 Leu Ala Glu Glu Glu Ile Ile Ile Arg Ser Glu Asn Leu Thr Asn Asn 50 55 60 Ala Lys Thr Ile Ile Val His Leu Asn Glu Ser Val Glu Ile Val Cys 65 70 75 80 Thr Arg Pro Asn Asn Asn Thr Arg Lys Ser Ile Arg Ile Gly Pro Gly 85 90 95 Gln Thr Phe Tyr Ala Thr Gly Asp Ile Ile Gly Asp Ile Arg Gln Ala 100 105 110 His Cys Asn Ile Ser Glu Gly Lys Trp Asn Lys Thr Leu Gln Lys Val 115 120 125 Lys Lys Lys Leu Lys Glu Glu Leu Tyr Lys Tyr Lys Val Val Glu Ile 130 135 140 Lys Pro Leu Gly Ile Ala Pro Thr Glu Ala Lys Arg Arg Val Val Glu 145 150 155 160 Arg Glu Lys Arg Ala Val Gly Ile Gly Ala Val Phe Leu Gly Phe Leu 165 170 175 Gly Ala Ala Gly Ser Thr Met Gly Ala Ala Ser Ile Thr Leu Thr Val 180 185 190 Gln Ala Arg Gln Leu Leu Ser Gly Ile Val Gln Gln Gln Ser Asn Leu 195 200 205 Leu Arg Ala Ile Glu Ala Gln Gln His Met Leu Gln Leu Thr Val Trp 210 215 220 Gly Ile Lys Gln Leu 225 
181479 RNA Human immunodeficiency virus type 1 18 augggugcga gagcgucaauauuaagaggg gaaaaauuag auaaauggga aaaaauuagg 60 uuaaggccag ggggaaagaaacauuauaug uuaaaacaca uaguaugggc aagcagggag 120 cuggaaagau uugcacuuaacccuggccuu uuagaaacau cagaaggaug uaaacaaaua 180 augaaacagc uacaaccagcucuccagaca ggaacagagg aacuuaaauc auuauacaac 240 acaguagcaa cucucuauuguguacaugaa aagauagaag uacgagacac caaggaagcc 300 uuagauaaga uagaggaagaacaaaacaaa ugucagcaaa aaacgcagca ggcaaaagcg 360 gcugacggga aagucagucaaaauuauccu auagugcaga aucuccaagg gcaaauggua 420 caucaagcca uaucaccuagaaccuugaau gcauggguaa aaguaauaga agaaaaggcu 480 uuuagcccag agguaauacccauguuuaca gcauuaucag aaggagccac cccacaagau 540 uuaaacacca uguuaaauacagugggggga caucaagcag ccaugcaaau guuaaaagau 600 acuauuaaug aagaggcugcagaaugggau agaguacauc caguccaugc ggggccuauu 660 gcaccaggcc agaugagagaaccaagggga agugacauag caggaacuac uaguacccuu 720 caggaacaaa uagcauggaugacaaguaac ccaccuauuc cagugggaga caucuauaaa 780 agauggauaa uucugggguuaaauaaaaua gugagaaugu auagcccggu cagcauuuug 840 gacauaagac aagggccaaaggaacccuuu cgagacuaug uagaucgguu cuuuaaaacu 900 uuaagagcug aacaagcuacacaagaagua aaaaauugga ugacagacac cuuguuaguc 960 caaaaugcga acccagauuguaagaccauu uugagagcau uaggaccagg ggcuacauua 1020 gaagaaauga ugacagcaugucaaggggug ggaggaccug gucacaaagc aagaguauug 1080 gcugaggcaa ugagucaagcaaacagugga aacauaauga ugcagagaag caauuuuaaa 1140 ggcccuagaa gaauuguuaaauguuuuaac uguggcaagg aagggcacau agccagaaau 1200 ugcagagccc cuaggaaaaaaggcuguugg aaauguggaa aggaaggaca ccaaaugaaa 1260 gacuguacug aaaggcaggcuaauuuuuua gggaaaauuu ggccuuccca caaggggagg 1320 ccagggaauu uccuucagaacagaccagag ccaacagccc caccagcaga gagcuucagg 1380 uucgaagaga caacccccgcuccgaaacag gagccgauag aaagggaacc cuuaacuucc 1440 cucaaaucac ucuuuggcagcgaccccuug ucucaauaa 1479 19 1479 DNA Human immunodeficiency virus type1 19 ttattgagac aaggggtcgc tgccaaagag tgatttgagg gaagttaagg gttccctttc60 tatcggctcc tgtttcggag cgggggttgt ctcttcgaac ctgaagctct ctgctggtgg 120ggctgttggc tctggtctgt tctgaaggaa attccctggc 
ctccccttgt gggaaggcca 180aattttccct aaaaaattag cctgcctttc agtacagtct ttcatttggt gtccttcctt 240tccacatttc caacagcctt ttttcctagg ggctctgcaa tttctggcta tgtgcccttc 300cttgccacag ttaaaacatt taacaattct tctagggcct ttaaaattgc ttctctgcat 360cattatgttt ccactgtttg cttgactcat tgcctcagcc aatactcttg ctttgtgacc 420aggtcctccc accccttgac atgctgtcat catttcttct aatgtagccc ctggtcctaa 480tgctctcaaa atggtcttac aatctgggtt cgcattttgg actaacaagg tgtctgtcat 540ccaatttttt acttcttgtg tagcttgttc agctcttaaa gttttaaaga accgatctac 600atagtctcga aagggttcct ttggcccttg tcttatgtcc aaaatgctga ccgggctata 660cattctcact attttattta accccagaat tatccatctt ttatagatgt ctcccactgg 720aataggtggg ttacttgtca tccatgctat ttgttcctga agggtactag tagttcctgc 780tatgtcactt ccccttggtt ctctcatctg gcctggtgca ataggccccg catggactgg 840atgtactcta tcccattctg cagcctcttc attaatagta tcttttaaca tttgcatggc 900tgcttgatgt ccccccactg tatttaacat ggtgtttaaa tcttgtgggg tggctccttc 960tgataatgct gtaaacatgg gtattacctc tgggctaaaa gccttttctt ctattacttt 1020tacccatgca ttcaaggttc taggtgatat ggcttgatgt accatttgcc cttggagatt 1080ctgcactata ggataatttt gactgacttt cccgtcagcc gcttttgcct gctgcgtttt 1140ttgctgacat ttgttttgtt cttcctctat cttatctaag gcttccttgg tgtctcgtac 1200ttctatcttt tcatgtacac aatagagagt tgctactgtg ttgtataatg atttaagttc 1260ctctgttcct gtctggagag ctggttgtag ctgtttcatt atttgtttac atccttctga 1320tgtttctaaa aggccagggt taagtgcaaa tctttccagc tccctgcttg cccatactat 1380gtgttttaac atataatgtt tctttccccc tggccttaac ctaatttttt cccatttatc 1440taatttttcc cctcttaata ttgacgctct cgcacccat 1479 20 1479 RNA Humanimmunodeficiency virus type 1 20 uuauugagac aaggggucgc ugccaaagagugauuugagg gaaguuaagg guucccuuuc 60 uaucggcucc uguuucggag cggggguugucucuucgaac cugaagcucu cugcuggugg 120 ggcuguuggc ucuggucugu ucugaaggaaauucccuggc cuccccuugu gggaaggcca 180 aauuuucccu aaaaaauuag ccugccuuucaguacagucu uucauuuggu guccuuccuu 240 uccacauuuc caacagccuu uuuuccuaggggcucugcaa uuucuggcua ugugcccuuc 300 cuugccacag uuaaaacauu uaacaauucuucuagggccu uuaaaauugc uucucugcau 360 
cauuauguuu ccacuguuug cuugacucauugccucagcc aauacucuug cuuugugacc 420 agguccuccc accccuugac augcugucaucauuucuucu aauguagccc cugguccuaa 480 ugcucucaaa auggucuuac aaucuggguucgcauuuugg acuaacaagg ugucugucau 540 ccaauuuuuu acuucuugug uagcuuguucagcucuuaaa guuuuaaaga accgaucuac 600 auagucucga aaggguuccu uuggcccuugucuuaugucc aaaaugcuga ccgggcuaua 660 cauucucacu auuuuauuua accccagaauuauccaucuu uuauagaugu cucccacugg 720 aauagguggg uuacuuguca uccaugcuauuuguuccuga aggguacuag uaguuccugc 780 uaugucacuu ccccuugguu cucucaucuggccuggugca auaggccccg cauggacugg 840 auguacucua ucccauucug cagccucuucauuaauagua ucuuuuaaca uuugcauggc 900 ugcuugaugu ccccccacug uauuuaacaugguguuuaaa ucuugugggg uggcuccuuc 960 ugauaaugcu guaaacaugg guauuaccucugggcuaaaa gccuuuucuu cuauuacuuu 1020 uacccaugca uucaagguuc uaggugauauggcuugaugu accauuugcc cuuggagauu 1080 cugcacuaua ggauaauuuu gacugacuuucccgucagcc gcuuuugccu gcugcguuuu 1140 uugcugacau uuguuuuguu cuuccucuaucuuaucuaag gcuuccuugg ugucucguac 1200 uucuaucuuu ucauguacac aauagagaguugcuacugug uuguauaaug auuuaaguuc 1260 cucuguuccu gucuggagag cugguuguagcuguuucauu auuuguuuac auccuucuga 1320 uguuucuaaa aggccagggu uaagugcaaaucuuuccagc ucccugcuug cccauacuau 1380 guguuuuaac auauaauguu ucuuucccccuggccuuaac cuaauuuuuu cccauuuauc 1440 uaauuuuucc ccucuuaaua uugacgcucucgcacccau 1479 21 2997 RNA Human immunodeficiency virus type 1 21uuuagggaaa auuuggccuu cccacaaggg gaggccaggg aauuuccuuc agaacagacc 60agagccaaca gccccaccag cagagagcuu cagguucgaa gaaacaaccc ccgcuccgaa 120acaggagccg agagaaaggg aacccuuaac uucccucaaa ucacucuuug gcagcgaccc 180cuugucucaa uaaaaauagg gggccagaca agggaggcuc ucuuagacac aggagcagau 240gauacaguau uagaagacau aaauuugcca ggaaaaugga aaccaaaaau gauaggagga 300auuggagguu uuaucaaagu aagacaguau gaucaaauac uuauagaaau uuguggaaaa 360aaggcuauag guacaguauu aguagggccu acaccuguca acauaauugg cagaaacaug 420uugacucagc uuggaugcac acuaaacuuu ccaaucaguc ccauugaaac uguaccagua 480aaacugaagc caggaaugga uggcccaaag guuaaacaau ggccguuaac agaagagaaa 540auaaaagcau uaacagcaau uugugaagaa 
auggaaaagg aaggaaaaau uacaaaaauu 600gggccugaaa auccauauaa cacuccaaua uuugccauaa aaaagaaaga cagcacuaag 660uggagaaaau uaguagauuu cagggaacuc aauaaaagaa cucaagacuu uugggagguu 720caauuaggaa uaccacaccc agcaggguua aaaaagaaaa aaucagugac aguacuggau 780gugggagaug cauauuuuuc aguuccuuua gaugaaggcu ucaggaaaua uacugcauuc 840accauaccua guauaaacaa ugaaacacca gggauuagau aucaauauaa ugugcuucca 900caaggaugga aagggucacc agcaauauuc caggguagca ugacaaaaau cuuagagccc 960uuuagagcuc aaaauccaga aauagucauc uaucaauaua uggaugacuu guauguagga 1020ucugacuuag aaauagggca acauagagca aaaauagaag aguuaagaga acaucuauua 1080aaguggggau uuaccacacc agacaaaaaa caucagaaag aacccccauu ucuuuggaug 1140ggguaugaac uccauccuga caaauggaca guacagccua uacagcugcc agaaaaggau 1200agcuggacug ucaaugauau acagaaguua gugggaaaau uaaacugggc aagucagauu 1260uacccaggga uuaaaguaag gcaacuuugu aagcuccuua gggggaccaa agcacuaaca 1320gacauaguac cacuaacuga agaagcagaa uuagaauugg cagagaacag ggaaauucua 1380aaagaaccag ugcauggagu auauuaugac ccaucaaaag acuugauagc ugaaauacag 1440aaacaggggg augaccaaug gacauaucaa auuuaccaag aaccauucaa aaaccugaag 1500acaggaaagu augcaaaaag gaggacuacc cacacuaaug auguaaaaca guuaacagag 1560gcagugcaaa aaauauccuu ggaaagcaua guaauauggg gaaagacucc uaaauuuaga 1620cuacccaucc aaaaagaaac augggaaaua ugguggacag acuauuggca agccacaugg 1680auuccugagu gggaguuugu uaauaccccu ccccuaguaa aacuauggua ccagcuagaa 1740aaagaaccca uagcaggagc agaaacuuuc uauguagaug gagcagcuaa uagggaaacu 1800aaaauaggaa aagcggggua uguuacugac agaggaaggc agaaaauugu aacucuaagu 1860gaaacaacaa aucagaagac ugaauuacaa gcaauucagc uagcuuugca agauucagaa 1920ucagaaguaa acauaauaac agacucacag uacgcauuag gaaucauuca agcacaacca 1980gauaggagug aaucagaguu ggucaaucaa auaauagaac aauuaauaaa aaaggaaagg 2040gucuaucugu cauggguacc agcacacaac ggacuugcag gaaaugaaca uguagauaaa 2100uuaguaagua ggggaaucag gaaagugcug guucuagaug gaauagauaa ggcucaugaa 2160gagcaugaaa aguaucacag caauuggaga gcaauggcua gugaguuuaa ucugccaccc 2220guaguagcaa gagaaauagu agccagcugu gauaaauguc agcuaaaagg ggaagccaua 
2280cauggacaag uagauuguag uccggggaua uggcaauuag auuguacaca uuuagaagga 2340aaaaucaucc ugguagcagu ccauguagcc aguggcuaca uagaagcaga gguuauccca 2400gcagaaacag gacaagaaac agcauacuau auacuaaaau uagcaggaag auggccaguc 2460aaaguaauac auacagacaa uggcaguaau uucaccagug cugcaguuaa ggcagccugu 2520uggugggcag guauccaaca ggaauuuggg auucccuaca auccccaaag ucagggagua 2580guagaaucca ugaauaaaga auuaaagaaa aucauagggc agguaagaga ucaagcugag 2640caccuuaaga cagcaguaca aauggcagua uucauucaca auuuuaaaag aaaagggggg 2700auuggggggu acagugcagg ggaaagaaua auagacauaa uagcaacaga cauacaaacu 2760aaagaauuac aaaaacaaau uauaaaaauu caaaauuuuc ggguuuauua cagagacagc 2820agagauccua uuuggaaagg accagccaag cuacucugga aaggugaagg ggcaguagua 2880auacaagaca acagugacau aaagguagua ccaaggagga aaguaaaaau cauuagggac 2940uauggaaaac agauggcagg ugcugauugu guggcaggua gacaggauga agauuag 2997 222997 DNA Human immunodeficiency virus type 1 22 ctaatcttca tcctgtctacctgccacaca atcagcacct gccatctgtt ttccatagtc 60 cctaatgatt tttactttcctccttggtac tacctttatg tcactgttgt cttgtattac 120 tactgcccct tcacctttccagagtagctt ggctggtcct ttccaaatag gatctctgct 180 gtctctgtaa taaacccgaaaattttgaat ttttataatt tgtttttgta attctttagt 240 ttgtatgtct gttgctattatgtctattat tctttcccct gcactgtacc ccccaatccc 300 cccttttctt ttaaaattgtgaatgaatac tgccatttgt actgctgtct taaggtgctc 360 agcttgatct cttacctgccctatgatttt ctttaattct ttattcatgg attctactac 420 tccctgactt tggggattgtagggaatccc aaattcctgt tggatacctg cccaccaaca 480 ggctgcctta actgcagcactggtgaaatt actgccattg tctgtatgta ttactttgac 540 tggccatctt cctgctaattttagtatata gtatgctgtt tcttgtcctg tttctgctgg 600 gataacctct gcttctatgtagccactggc tacatggact gctaccagga tgatttttcc 660 ttctaaatgt gtacaatctaattgccatat ccccggacta caatctactt gtccatgtat 720 ggcttcccct tttagctgacatttatcaca gctggctact atttctcttg ctactacggg 780 tggcagatta aactcactagccattgctct ccaattgctg tgatactttt catgctcttc 840 atgagcctta tctattccatctagaaccag cactttcctg attcccctac ttactaattt 900 atctacatgt tcatttcctgcaagtccgtt gtgtgctggt acccatgaca gatagaccct 960 
ttcctttttt attaattgttctattatttg attgaccaac tctgattcac tcctatctgg 1020 ttgtgcttga atgattcctaatgcgtactg tgagtctgtt attatgttta cttctgattc 1080 tgaatcttgc aaagctagctgaattgcttg taattcagtc ttctgatttg ttgtttcact 1140 tagagttaca attttctgccttcctctgtc agtaacatac cccgcttttc ctattttagt 1200 ttccctatta gctgctccatctacatagaa agtttctgct cctgctatgg gttctttttc 1260 tagctggtac catagttttactaggggagg ggtattaaca aactcccact caggaatcca 1320 tgtggcttgc caatagtctgtccaccatat ttcccatgtt tctttttgga tgggtagtct 1380 aaatttagga gtctttccccatattactat gctttccaag gatatttttt gcactgcctc 1440 tgttaactgt tttacatcattagtgtgggt agtcctcctt tttgcatact ttcctgtctt 1500 caggtttttg aatggttcttggtaaatttg atatgtccat tggtcatccc cctgtttctg 1560 tatttcagct atcaagtcttttgatgggtc ataatatact ccatgcactg gttcttttag 1620 aatttccctg ttctctgccaattctaattc tgcttcttca gttagtggta ctatgtctgt 1680 tagtgctttg gtccccctaaggagcttaca aagttgcctt actttaatcc ctgggtaaat 1740 ctgacttgcc cagtttaattttcccactaa cttctgtata tcattgacag tccagctatc 1800 cttttctggc agctgtataggctgtactgt ccatttgtca ggatggagtt cataccccat 1860 ccaaagaaat gggggttctttctgatgttt tttgtctggt gtggtaaatc cccactttaa 1920 tagatgttct cttaactcttctatttttgc tctatgttgc cctatttcta agtcagatcc 1980 tacatacaag tcatccatatattgatagat gactatttct ggattttgag ctctaaaggg 2040 ctctaagatt tttgtcatgctaccctggaa tattgctggt gaccctttcc atccttgtgg 2100 aagcacatta tattgatatctaatccctgg tgtttcattg tttatactag gtatggtgaa 2160 tgcagtatat ttcctgaagccttcatctaa aggaactgaa aaatatgcat ctcccacatc 2220 cagtactgtc actgattttttcttttttaa ccctgctggg tgtggtattc ctaattgaac 2280 ctcccaaaag tcttgagttcttttattgag ttccctgaaa tctactaatt ttctccactt 2340 agtgctgtct ttcttttttatggcaaatat tggagtgtta tatggatttt caggcccaat 2400 ttttgtaatt tttccttccttttccatttc ttcacaaatt gctgttaatg cttttatttt 2460 ctcttctgtt aacggccattgtttaacctt tgggccatcc attcctggct tcagttttac 2520 tggtacagtt tcaatgggactgattggaaa gtttagtgtg catccaagct gagtcaacat 2580 gtttctgcca attatgttgacaggtgtagg ccctactaat actgtaccta tagccttttt 2640 tccacaaatt tctataagtatttgatcata 
ctgtcttact ttgataaaac ctccaattcc 2700 tcctatcatt tttggtttccattttcctgg caaatttatg tcttctaata ctgtatcatc 2760 tgctcctgtg tctaagagagcctcccttgt ctggccccct atttttattg agacaagggg 2820 tcgctgccaa agagtgatttgagggaagtt aagggttccc tttctctcgg ctcctgtttc 2880 ggagcggggg ttgtttcttcgaacctgaag ctctctgctg gtggggctgt tggctctggt 2940 ctgttctgaa ggaaattccctggcctcccc ttgtgggaag gccaaatttt ccctaaa 2997 23 2997 RNA Humanimmunodeficiency virus type 1 23 cuaaucuuca uccugucuac cugccacacaaucagcaccu gccaucuguu uuccauaguc 60 ccuaaugauu uuuacuuucc uccuugguacuaccuuuaug ucacuguugu cuuguauuac 120 uacugccccu ucaccuuucc agaguagcuuggcugguccu uuccaaauag gaucucugcu 180 gucucuguaa uaaacccgaa aauuuugaauuuuuauaauu uguuuuugua auucuuuagu 240 uuguaugucu guugcuauua ugucuauuauucuuuccccu gcacuguacc ccccaauccc 300 cccuuuucuu uuaaaauugu gaaugaauacugccauuugu acugcugucu uaaggugcuc 360 agcuugaucu cuuaccugcc cuaugauuuucuuuaauucu uuauucaugg auucuacuac 420 ucccugacuu uggggauugu agggaaucccaaauuccugu uggauaccug cccaccaaca 480 ggcugccuua acugcagcac uggugaaauuacugccauug ucuguaugua uuacuuugac 540 uggccaucuu ccugcuaauu uuaguauauaguaugcuguu ucuuguccug uuucugcugg 600 gauaaccucu gcuucuaugu agccacuggcuacauggacu gcuaccagga ugauuuuucc 660 uucuaaaugu guacaaucua auugccauauccccggacua caaucuacuu guccauguau 720 ggcuuccccu uuuagcugac auuuaucacagcuggcuacu auuucucuug cuacuacggg 780 uggcagauua aacucacuag ccauugcucuccaauugcug ugauacuuuu caugcucuuc 840 augagccuua ucuauuccau cuagaaccagcacuuuccug auuccccuac uuacuaauuu 900 aucuacaugu ucauuuccug caaguccguugugugcuggu acccaugaca gauagacccu 960 uuccuuuuuu auuaauuguu cuauuauuugauugaccaac ucugauucac uccuaucugg 1020 uugugcuuga augauuccua augcguacugugagucuguu auuauguuua cuucugauuc 1080 ugaaucuugc aaagcuagcu gaauugcuuguaauucaguc uucugauuug uuguuucacu 1140 uagaguuaca auuuucugcc uuccucugucaguaacauac cccgcuuuuc cuauuuuagu 1200 uucccuauua gcugcuccau cuacauagaaaguuucugcu ccugcuaugg guucuuuuuc 1260 uagcugguac cauaguuuua cuaggggagggguauuaaca aacucccacu caggaaucca 1320 uguggcuugc caauagucug 
uccaccauauuucccauguu ucuuuuugga uggguagucu 1380 aaauuuagga gucuuucccc auauuacuaugcuuuccaag gauauuuuuu gcacugccuc 1440 uguuaacugu uuuacaucau uaguguggguaguccuccuu uuugcauacu uuccugucuu 1500 cagguuuuug aaugguucuu gguaaauuugauauguccau uggucauccc ccuguuucug 1560 uauuucagcu aucaagucuu uugaugggucauaauauacu ccaugcacug guucuuuuag 1620 aauuucccug uucucugcca auucuaauucugcuucuuca guuaguggua cuaugucugu 1680 uagugcuuug gucccccuaa ggagcuuacaaaguugccuu acuuuaaucc cuggguaaau 1740 cugacuugcc caguuuaauu uucccacuaacuucuguaua ucauugacag uccagcuauc 1800 cuuuucuggc agcuguauag gcuguacuguccauuuguca ggauggaguu cauaccccau 1860 ccaaagaaau ggggguucuu ucugauguuuuuugucuggu gugguaaauc cccacuuuaa 1920 uagauguucu cuuaacucuu cuauuuuugcucuauguugc ccuauuucua agucagaucc 1980 uacauacaag ucauccauau auugauagaugacuauuucu ggauuuugag cucuaaaggg 2040 cucuaagauu uuugucaugc uacccuggaauauugcuggu gacccuuucc auccuugugg 2100 aagcacauua uauugauauc uaaucccugguguuucauug uuuauacuag guauggugaa 2160 ugcaguauau uuccugaagc cuucaucuaaaggaacugaa aaauaugcau cucccacauc 2220 caguacuguc acugauuuuu ucuuuuuuaacccugcuggg ugugguauuc cuaauugaac 2280 cucccaaaag ucuugaguuc uuuuauugaguucccugaaa ucuacuaauu uucuccacuu 2340 agugcugucu uucuuuuuua uggcaaauauuggaguguua uauggauuuu caggcccaau 2400 uuuuguaauu uuuccuuccu uuuccauuucuucacaaauu gcuguuaaug cuuuuauuuu 2460 cucuucuguu aacggccauu guuuaaccuuugggccaucc auuccuggcu ucaguuuuac 2520 ugguacaguu ucaaugggac ugauuggaaaguuuagugug cauccaagcu gagucaacau 2580 guuucugcca auuauguuga cagguguaggcccuacuaau acuguaccua uagccuuuuu 2640 uccacaaauu ucuauaagua uuugaucauacugucuuacu uugauaaaac cuccaauucc 2700 uccuaucauu uuugguuucc auuuuccuggcaaauuuaug ucuucuaaua cuguaucauc 2760 ugcuccugug ucuaagagag ccucccuugucuggcccccu auuuuuauug agacaagggg 2820 ucgcugccaa agagugauuu gagggaaguuaaggguuccc uuucucucgg cuccuguuuc 2880 ggagcggggg uuguuucuuc gaaccugaagcucucugcug guggggcugu uggcucuggu 2940 cuguucugaa ggaaauuccc uggccuccccuugugggaag gccaaauuuu cccuaaa 2997 24 2535 RNA Human immunodeficiencyvirus type 1 24 augagaguga 
uggggauaca gaggaauugg ccacaauggu ggauauggggcaccuuaggc 60 uuuuggauga uaauaauuug uaggguggug gggaacuuga acuugugggucacagucuau 120 uaugggguac cuguguggaa agaagcaaaa acuacucuau ucugugcaucagaugcuaaa 180 gcauaugaua aagaaguaca uaaugucugg gcuacacaug ccuguguacccacagacccc 240 aacccacgag aaauaguuuu ggaaaaugua acagaaaauu uuaacauguggaaaaaugac 300 augguggauc agaugcauga ggauauaauc aguuuauggg aucaaagccuaaaaccaugu 360 guaaaguuga ccccacucug ugucacuuua aauuguacaa augcaccugccuacaauaau 420 agcaugcaug gagaaaugaa aaauugcucu uucaauacaa ccacagagauaagagauagg 480 aaacagaaag cguaugcacu uuuuuauaaa ccugauguag ugccacuuaauaggagagaa 540 gagaauaaug ggacaggaga guauauauua auaaauugca auuccucaaccauaacacaa 600 gccuguccaa aggucacuuu ugacccaauu ccuauacauu auugugcuccagcugguuau 660 gcgauucuaa aguguaauaa uaagacauuc aaugggacag gaccaugcaauaaugucagc 720 acaguacaau guacacaugg aauuaugcca gugguaucaa cucaauuacuguuaaauggu 780 agccuagcag aagaagagau aauaauuaga ucugaaaauc ugacaaacaauaucaaaaca 840 auaauagucc accuuaauaa aucuguagaa auugugugua caagacccaacaauaauaca 900 agaaaaagua uaaggauagg accaggacaa acauucuaug caacaggugaaauaauagga 960 aacauaagag aagcacauug uaacauuagu aaaaguaacu ggaccaguacuuuagaacag 1020 guaaagaaaa aauuaaaaga acacuacaau aagacaauag aauuuaacccacccucagga 1080 ggggaucuag aaguuacaac acauagcuuu aauuguagag gagaauuuuucuauugcaau 1140 acaacaaaac uguuuucaaa caacagugau ucaaacaacg aaaccaucacacucccaugc 1200 aagauaaaac aaauuauaaa cauguggcag aagguaggac gagcaauguaugccccuccc 1260 auugaaggaa acauaacaug uaaaucaaau aucacaggac uacuauugacacgugaugga 1320 ggaaagaaua caacaaauga gauauucaga ccgggaggag gaaauaugaaggacaauugg 1380 agaagugaau uauauaaaua uaaaguggua gaaauugagc cauugggaguagcacccacu 1440 aaaucaaaaa ggagaguggu ggagagagaa aaaagagcag ugggacuaggagcuguacuc 1500 cuuggguucu ugggagcagc aggaagcacu augggcgcgg cgucaauaacgcugacggua 1560 caggccagac aacuguuguc ugguauagug caacagcaaa gcaauuugcugagagcuaua 1620 gaggcgcaac agcauauguu gcaacucacg gucuggggca uuaagcagcuccagacaaga 1680 gucuuggcua uagagagaua ccuaaaggau caacagcucc uagggcuuuggggcugcucu 
1740 ggaaaaauca ucugcaccac ugcugugccu uggaacucca guuggaguaauaaaucucaa 1800 gaagauauuu gggauaacau gaccuggaug cagugggaua gagaaauuaguaauuacaca 1860 ggcacaauau auagguuacu ugaagacucg caaaaccagc aggagaaaaaugaaaaagau 1920 uuauuagcau uggacaguug gaaaaacuug uggaauuggu uuaacauaacaaauuggcug 1980 ugguauauaa aaauauucau caugauagua ggaggcuuga uagguuugagaauaauuuuu 2040 gguguacucg cuauagugaa aagaguuagg cagggauacu caccuuugucguuucagacc 2100 cuuaccccaa gcccgagggg ucccgacagg cucggaagaa ucgaagaagaagguggagag 2160 caagacaaag acagauccau ucgauuagug agcggauucu uagcacuugccugggacgau 2220 cugcggagcc ugugccucuu cagcuaccac cacuugagag acuucauauugauugcagcg 2280 agagcagcgg aacuucuggg acgcagcagu cucaggggac ugcagagagggugggaagcc 2340 cuuaaguauc ugggaaaucu ugugcaguau gggggucugg agcuaaaaagaagugcuauu 2400 aaacuguuug auaccauagc aauagcagua gcugaaggaa cagauaggauucuugaagua 2460 auacagagaa uuuguagagc uauccgccac auaccuauaa gaauaagacagggcuuugaa 2520 gcagcuuugc aauaa 2535 25 2535 DNA Human immunodeficiencyvirus type 1 25 ttattgcaaa gctgcttcaa agccctgtct tattcttata ggtatgtggcggatagctct 60 acaaattctc tgtattactt caagaatcct atctgttcct tcagctactgctattgctat 120 ggtatcaaac agtttaatag cacttctttt tagctccaga cccccatactgcacaagatt 180 tcccagatac ttaagggctt cccaccctct ctgcagtccc ctgagactgctgcgtcccag 240 aagttccgct gctctcgctg caatcaatat gaagtctctc aagtggtggtagctgaagag 300 gcacaggctc cgcagatcgt cccaggcaag tgctaagaat ccgctcactaatcgaatgga 360 tctgtctttg tcttgctctc caccttcttc ttcgattctt ccgagcctgtcgggacccct 420 cgggcttggg gtaagggtct gaaacgacaa aggtgagtat ccctgcctaactcttttcac 480 tatagcgagt acaccaaaaa ttattctcaa acctatcaag cctcctactatcatgatgaa 540 tatttttata taccacagcc aatttgttat gttaaaccaa ttccacaagtttttccaact 600 gtccaatgct aataaatctt tttcattttt ctcctgctgg ttttgcgagtcttcaagtaa 660 cctatatatt gtgcctgtgt aattactaat ttctctatcc cactgcatccaggtcatgtt 720 atcccaaata tcttcttgag atttattact ccaactggag ttccaaggcacagcagtggt 780 gcagatgatt tttccagagc agccccaaag ccctaggagc tgttgatcctttaggtatct 840 ctctatagcc aagactcttg tctggagctg cttaatgccc 
cagaccgtgagttgcaacat 900 atgctgttgc gcctctatag ctctcagcaa attgctttgc tgttgcactataccagacaa 960 cagttgtctg gcctgtaccg tcagcgttat tgacgccgcg cccatagtgcttcctgctgc 1020 tcccaagaac ccaaggagta cagctcctag tcccactgct cttttttctctctccaccac 1080 tctccttttt gatttagtgg gtgctactcc caatggctca atttctaccactttatattt 1140 atataattca cttctccaat tgtccttcat atttcctcct cccggtctgaatatctcatt 1200 tgttgtattc tttcctccat cacgtgtcaa tagtagtcct gtgatatttgatttacatgt 1260 tatgtttcct tcaatgggag gggcatacat tgctcgtcct accttctgccacatgtttat 1320 aatttgtttt atcttgcatg ggagtgtgat ggtttcgttg tttgaatcactgttgtttga 1380 aaacagtttt gttgtattgc aatagaaaaa ttctcctcta caattaaagctatgtgttgt 1440 aacttctaga tcccctcctg agggtgggtt aaattctatt gtcttattgtagtgttcttt 1500 taattttttc tttacctgtt ctaaagtact ggtccagtta cttttactaatgttacaatg 1560 tgcttctctt atgtttccta ttatttcacc tgttgcatag aatgtttgtcctggtcctat 1620 ccttatactt tttcttgtat tattgttggg tcttgtacac acaatttctacagatttatt 1680 aaggtggact attattgttt tgatattgtt tgtcagattt tcagatctaattattatctc 1740 ttcttctgct aggctaccat ttaacagtaa ttgagttgat accactggcataattccatg 1800 tgtacattgt actgtgctga cattattgca tggtcctgtc ccattgaatgtcttattatt 1860 acactttaga atcgcataac cagctggagc acaataatgt ataggaattgggtcaaaagt 1920 gacctttgga caggcttgtg ttatggttga ggaattgcaa tttattaatatatactctcc 1980 tgtcccatta ttctcttctc tcctattaag tggcactaca tcaggtttataaaaaagtgc 2040 atacgctttc tgtttcctat ctcttatctc tgtggttgta ttgaaagagcaatttttcat 2100 ttctccatgc atgctattat tgtaggcagg tgcatttgta caatttaaagtgacacagag 2160 tggggtcaac tttacacatg gttttaggct ttgatcccat aaactgattatatcctcatg 2220 catctgatcc accatgtcat ttttccacat gttaaaattt tctgttacattttccaaaac 2280 tatttctcgt gggttggggt ctgtgggtac acaggcatgt gtagcccagacattatgtac 2340 ttctttatca tatgctttag catctgatgc acagaataga gtagtttttgcttctttcca 2400 cacaggtacc ccataataga ctgtgaccca caagttcaag ttccccaccaccctacaaat 2460 tattatcatc caaaagccta aggtgcccca tatccaccat tgtggccaattcctctgtat 2520 ccccatcact ctcat 2535 26 2535 RNA Human immunodeficiencyvirus type 1 26 uuauugcaaa 
gcugcuucaa agcccugucu uauucuuaua gguauguggcggauagcucu 60 acaaauucuc uguauuacuu caagaauccu aucuguuccu ucagcuacugcuauugcuau 120 gguaucaaac aguuuaauag cacuucuuuu uagcuccaga cccccauacugcacaagauu 180 ucccagauac uuaagggcuu cccacccucu cugcaguccc cugagacugcugcgucccag 240 aaguuccgcu gcucucgcug caaucaauau gaagucucuc aaguggugguagcugaagag 300 gcacaggcuc cgcagaucgu cccaggcaag ugcuaagaau ccgcucacuaaucgaaugga 360 ucugucuuug ucuugcucuc caccuucuuc uucgauucuu ccgagccugucgggaccccu 420 cgggcuuggg guaagggucu gaaacgacaa aggugaguau cccugccuaacucuuuucac 480 uauagcgagu acaccaaaaa uuauucucaa accuaucaag ccuccuacuaucaugaugaa 540 uauuuuuaua uaccacagcc aauuuguuau guuaaaccaa uuccacaaguuuuuccaacu 600 guccaaugcu aauaaaucuu uuucauuuuu cuccugcugg uuuugcgagucuucaaguaa 660 ccuauauauu gugccugugu aauuacuaau uucucuaucc cacugcauccaggucauguu 720 aucccaaaua ucuucuugag auuuauuacu ccaacuggag uuccaaggcacagcaguggu 780 gcagaugauu uuuccagagc agccccaaag cccuaggagc uguugauccuuuagguaucu 840 cucuauagcc aagacucuug ucuggagcug cuuaaugccc cagaccgugaguugcaacau 900 augcuguugc gccucuauag cucucagcaa auugcuuugc uguugcacuauaccagacaa 960 caguugucug gccuguaccg ucagcguuau ugacgccgcg cccauagugcuuccugcugc 1020 ucccaagaac ccaaggagua cagcuccuag ucccacugcu cuuuuuucucucuccaccac 1080 ucuccuuuuu gauuuagugg gugcuacucc caauggcuca auuucuaccacuuuauauuu 1140 auauaauuca cuucuccaau uguccuucau auuuccuccu cccggucugaauaucucauu 1200 uguuguauuc uuuccuccau cacgugucaa uaguaguccu gugauauuugauuuacaugu 1260 uauguuuccu ucaaugggag gggcauacau ugcucguccu accuucugccacauguuuau 1320 aauuuguuuu aucuugcaug ggagugugau gguuucguug uuugaaucacuguuguuuga 1380 aaacaguuuu guuguauugc aauagaaaaa uucuccucua caauuaaagcuauguguugu 1440 aacuucuaga uccccuccug aggguggguu aaauucuauu gucuuauuguaguguucuuu 1500 uaauuuuuuc uuuaccuguu cuaaaguacu gguccaguua cuuuuacuaauguuacaaug 1560 ugcuucucuu auguuuccua uuauuucacc uguugcauag aauguuuguccugguccuau 1620 ccuuauacuu uuucuuguau uauuguuggg ucuuguacac acaauuucuacagauuuauu 1680 aagguggacu auuauuguuu ugauauuguu ugucagauuu ucagaucuaauuauuaucuc 
1740 uucuucugcu aggcuaccau uuaacaguaa uugaguugau accacuggcauaauuccaug 1800 uguacauugu acugugcuga cauuauugca ugguccuguc ccauugaaugucuuauuauu 1860 acacuuuaga aucgcauaac cagcuggagc acaauaaugu auaggaauugggucaaaagu 1920 gaccuuugga caggcuugug uuaugguuga ggaauugcaa uuuauuaauauauacucucc 1980 ugucccauua uucucuucuc uccuauuaag uggcacuaca ucagguuuauaaaaaagugc 2040 auacgcuuuc uguuuccuau cucuuaucuc ugugguugua uugaaagagcaauuuuucau 2100 uucuccaugc augcuauuau uguaggcagg ugcauuugua caauuuaaagugacacagag 2160 uggggucaac uuuacacaug guuuuaggcu uugaucccau aaacugauuauauccucaug 2220 caucugaucc accaugucau uuuuccacau guuaaaauuu ucuguuacauuuuccaaaac 2280 uauuucucgu ggguuggggu cuguggguac acaggcaugu guagcccagacauuauguac 2340 uucuuuauca uaugcuuuag caucugaugc acagaauaga guaguuuuugcuucuuucca 2400 cacagguacc ccauaauaga cugugaccca caaguucaag uuccccaccacccuacaaau 2460 uauuaucauc caaaagccua aggugcccca uauccaccau uguggccaauuccucuguau 2520 ccccaucacu cucau 2535 27 2579 RNA Human immunodeficiencyvirus type 1 27 aggcuaauuu uuuagggaaa auuuggccuu cccacaaggg gaggccagggaauuuccuuc 60 agagcaggcc aaugagagug agggggauac agaggaauug gccacaaugguggauauggg 120 gcaucuuagg cuuuuggaug uuaaugauuu guaguggggu gggaaacuugugggucacaa 180 ucuauuaugg gguaccugug uggagagaag caaaaacuac ucuauucugugcaucagaug 240 cuaaagcaua ugauagagaa gugcauaaug ucugggcuac acaugccuguguacccacag 300 accccaaccc acaagaaaua guuaugggaa auguaacaga aaauuuuaacauguggaaaa 360 augacauggu ggaucagaug caugaggaua uaaucaauuu augggaucaaagccuaaagc 420 cauguguaaa guuaacccca cucuguguca cuuuaaaaug uaguaccuauaaugguagug 480 auaccaacga uaugagaaau ugcucuuuca auacaacuac agaaauaagggacaagaaac 540 agacagugua ugcacuuuuu uauaaaccug auauaguacc aauuaaugagagugaguaua 600 uauuaauaca uugcaauacc ucaaccauaa cacaagccug uccaaaggucucuuuugacc 660 caauuccuau acauuauugu gcuccagcug guuaugcgau ucuaaaguguaauaauaaga 720 cauucaaugg gacgggacca ugccaaaaug ucagcacagu acaaugcacacauggaauua 780 agccaguagu aucaacucaa cuacuguuaa augguagcau agcagaaggagagauaauaa 840 uuagaucuga aaaucugaca aacaauguua aaacaauaau 
aguacaccuuaaugaaucua 900 uaggaauugu guguacaaga cccggcaaua auacaagaaa aaguauaaggauaggaccag 960 gacaagcauu cuauacaaau cacauaauag gagauauaag acaagcauauuguaacauua 1020 guaaacaaga auggaacaaa acuuuagaag aggugagaaa aaaauugcaagaacacuucc 1080 caaauaaaac aauaaaauuu aacucauccu caggagggga ccuagaaauuacaacacaua 1140 gcuuuaauug cagaggagaa uuuuucuauu gcaauacauc aaaacuauuuaaugauaguc 1200 uaguaaauga uacagaaagu aauucaacca ucacuauucc augcagaauaaaacaaauua 1260 uaaacaugug gcaggaggua ggacgagcaa uguaugcccc ucccauugcaggaaacauaa 1320 cauguaaauc aaauaucaca ggacuacuau ugacacguga uggaggaacagauaacacaa 1380 cagagauauu cagaccugga ggaggaaaua ugaaggacaa uuggagaagugaauuauaua 1440 aauauaaagu aguagaaauu aagccauugg gaauagcacc cacugaagcaaaaaggagag 1500 ugguggagag agaaaaaaga gcagugggaa uaggagcugu gcuccuuggguucuugggag 1560 cagcaggaag cacuaugggc gcggcgucaa uaacgcugac gguacaggccagacaacugu 1620 ugucugguau agugcaacag caaagcaauu ugcugagagc uauagaggcgcaacagcaua 1680 uguugcaacu cacagucugg ggcauuaagc agcuccagac aagaguccuggcuauagaaa 1740 gauaccuaaa ggaucaacag cuccuaggac uuuggggcug cucuggaaaacucaucugca 1800 ccacuaaugu gccuuggaac uccaguugga gcaauaaauc ucaacaagcuauuugggaua 1860 acaugacaug gaugcagugg gauagagaaa uuaauaauua cacaaacauaauauaccagu 1920 ugcuugagga cucgcaaauc cagcaggaac agaaugaaaa agauuuauuagcauuggaca 1980 aguggcaaaa ucuguggagu ugguuuagca uaacaaauug gcuaugguauauaaaaauau 2040 ucauaaugau aguaggaggc uuaauagguu uaagaauaau uuuugcugugcuaucuauag 2100 uaaauagagu uaggcaggga uacucaccuu ugucguuuca gacccuuaccccaaacccga 2160 ggggacccga caggcucgga gaaaucgaag aagaaggugg agagcaagacagagacagau 2220 ccguucgauu agugagcgga uucuuaccac uugccuggga cgaucugcggagccugugcc 2280 ucuucagcua ccaccgauug agagacuuca uauucgauug cagcgaggacaguggaacuu 2340 cugggacgca gcagucucag gggacuccag agggguggga aguccuuaaauaucugggaa 2400 gccuugugca guauuggggu cuggagcuaa aaagagugcu auuagucugcuugauaccca 2460 uagcaauagc aguagcugaa ggaacagaua ggauuauuga auuaguacuaagauuuugua 2520 gagcuauccg caacauaccu acaagaguaa gacagggcug ugaagcagcuuugcuauaa 2579 28 2579 DNA 
Human immunodeficiency virus type 1 28ttatagcaaa gctgcttcac agccctgtct tactcttgta ggtatgttgc ggatagctct 60acaaaatctt agtactaatt caataatcct atctgttcct tcagctactg ctattgctat 120gggtatcaag cagactaata gcactctttt tagctccaga ccccaatact gcacaaggct 180tcccagatat ttaaggactt cccacccctc tggagtcccc tgagactgct gcgtcccaga 240agttccactg tcctcgctgc aatcgaatat gaagtctctc aatcggtggt agctgaagag 300gcacaggctc cgcagatcgt cccaggcaag tggtaagaat ccgctcacta atcgaacgga 360tctgtctctg tcttgctctc caccttcttc ttcgatttct ccgagcctgt cgggtcccct 420cgggtttggg gtaagggtct gaaacgacaa aggtgagtat ccctgcctaa ctctatttac 480tatagatagc acagcaaaaa ttattcttaa acctattaag cctcctacta tcattatgaa 540tatttttata taccatagcc aatttgttat gctaaaccaa ctccacagat tttgccactt 600gtccaatgct aataaatctt tttcattctg ttcctgctgg atttgcgagt cctcaagcaa 660ctggtatatt atgtttgtgt aattattaat ttctctatcc cactgcatcc atgtcatgtt 720atcccaaata gcttgttgag atttattgct ccaactggag ttccaaggca cattagtggt 780gcagatgagt tttccagagc agccccaaag tcctaggagc tgttgatcct ttaggtatct 840ttctatagcc aggactcttg tctggagctg cttaatgccc cagactgtga gttgcaacat 900atgctgttgc gcctctatag ctctcagcaa attgctttgc tgttgcacta taccagacaa 960cagttgtctg gcctgtaccg tcagcgttat tgacgccgcg cccatagtgc ttcctgctgc 1020tcccaagaac ccaaggagca cagctcctat tcccactgct cttttttctc tctccaccac 1080tctccttttt gcttcagtgg gtgctattcc caatggctta atttctacta ctttatattt 1140atataattca cttctccaat tgtccttcat atttcctcct ccaggtctga atatctctgt 1200tgtgttatct gttcctccat cacgtgtcaa tagtagtcct gtgatatttg atttacatgt 1260tatgtttcct gcaatgggag gggcatacat tgctcgtcct acctcctgcc acatgtttat 1320aatttgtttt attctgcatg gaatagtgat ggttgaatta ctttctgtat catttactag 1380actatcatta aatagttttg atgtattgca atagaaaaat tctcctctgc aattaaagct 1440atgtgttgta atttctaggt cccctcctga ggatgagtta aattttattg ttttatttgg 1500gaagtgttct tgcaattttt ttctcacctc ttctaaagtt ttgttccatt cttgtttact 1560aatgttacaa tatgcttgtc ttatatctcc tattatgtga tttgtataga atgcttgtcc 1620tggtcctatc cttatacttt ttcttgtatt attgccgggt cttgtacaca caattcctat 1680agattcatta 
aggtgtacta ttattgtttt aacattgttt gtcagatttt cagatctaat 1740tattatctct ccttctgcta tgctaccatt taacagtagt tgagttgata ctactggctt 1800aattccatgt gtgcattgta ctgtgctgac attttggcat ggtcccgtcc cattgaatgt 1860cttattatta cactttagaa tcgcataacc agctggagca caataatgta taggaattgg 1920gtcaaaagag acctttggac aggcttgtgt tatggttgag gtattgcaat gtattaatat 1980atactcactc tcattaattg gtactatatc aggtttataa aaaagtgcat acactgtctg 2040tttcttgtcc cttatttctg tagttgtatt gaaagagcaa tttctcatat cgttggtatc 2100actaccatta taggtactac attttaaagt gacacagagt ggggttaact ttacacatgg 2160ctttaggctt tgatcccata aattgattat atcctcatgc atctgatcca ccatgtcatt 2220tttccacatg ttaaaatttt ctgttacatt tcccataact atttcttgtg ggttggggtc 2280tgtgggtaca caggcatgtg tagcccagac attatgcact tctctatcat atgctttagc 2340atctgatgca cagaatagag tagtttttgc ttctctccac acaggtaccc cataatagat 2400tgtgacccac aagtttccca ccccactaca aatcattaac atccaaaagc ctaagatgcc 2460ccatatccac cattgtggcc aattcctctg tatccccctc actctcattg gcctgctctg 2520aaggaaattc cctggcctcc ccttgtggga aggccaaatt ttccctaaaa aattagcct 2579 292579 RNA Human immunodeficiency virus type 1 29 uuauagcaaa gcugcuucacagcccugucu uacucuugua gguauguugc ggauagcucu 60 acaaaaucuu aguacuaauucaauaauccu aucuguuccu ucagcuacug cuauugcuau 120 ggguaucaag cagacuaauagcacucuuuu uagcuccaga ccccaauacu gcacaaggcu 180 ucccagauau uuaaggacuucccaccccuc uggagucccc ugagacugcu gcgucccaga 240 aguuccacug uccucgcugcaaucgaauau gaagucucuc aaucgguggu agcugaagag 300 gcacaggcuc cgcagaucgucccaggcaag ugguaagaau ccgcucacua aucgaacgga 360 ucugucucug ucuugcucuccaccuucuuc uucgauuucu ccgagccugu cggguccccu 420 cggguuuggg guaagggucugaaacgacaa aggugaguau cccugccuaa cucuauuuac 480 uauagauagc acagcaaaaauuauucuuaa accuauuaag ccuccuacua ucauuaugaa 540 uauuuuuaua uaccauagccaauuuguuau gcuaaaccaa cuccacagau uuugccacuu 600 guccaaugcu aauaaaucuuuuucauucug uuccugcugg auuugcgagu ccucaagcaa 660 cugguauauu auguuuguguaauuauuaau uucucuaucc cacugcaucc augucauguu 720 aucccaaaua gcuuguugagauuuauugcu ccaacuggag uuccaaggca cauuaguggu 780 gcagaugagu 
uuuccagagcagccccaaag uccuaggagc uguugauccu uuagguaucu 840 uucuauagcc aggacucuugucuggagcug cuuaaugccc cagacuguga guugcaacau 900 augcuguugc gccucuauagcucucagcaa auugcuuugc uguugcacua uaccagacaa 960 caguugucug gccuguaccgucagcguuau ugacgccgcg cccauagugc uuccugcugc 1020 ucccaagaac ccaaggagcacagcuccuau ucccacugcu cuuuuuucuc ucuccaccac 1080 ucuccuuuuu gcuucagugggugcuauucc caauggcuua auuucuacua cuuuauauuu 1140 auauaauuca cuucuccaauuguccuucau auuuccuccu ccaggucuga auaucucugu 1200 uguguuaucu guuccuccaucacgugucaa uaguaguccu gugauauuug auuuacaugu 1260 uauguuuccu gcaaugggaggggcauacau ugcucguccu accuccugcc acauguuuau 1320 aauuuguuuu auucugcauggaauagugau gguugaauua cuuucuguau cauuuacuag 1380 acuaucauua aauaguuuugauguauugca auagaaaaau ucuccucugc aauuaaagcu 1440 auguguugua auuucuagguccccuccuga ggaugaguua aauuuuauug uuuuauuugg 1500 gaaguguucu ugcaauuuuuuucucaccuc uucuaaaguu uuguuccauu cuuguuuacu 1560 aauguuacaa uaugcuugucuuauaucucc uauuauguga uuuguauaga augcuugucc 1620 ugguccuauc cuuauacuuuuucuuguauu auugccgggu cuuguacaca caauuccuau 1680 agauucauua agguguacuauuauuguuuu aacauuguuu gucagauuuu cagaucuaau 1740 uauuaucucu ccuucugcuaugcuaccauu uaacaguagu ugaguugaua cuacuggcuu 1800 aauuccaugu gugcauuguacugugcugac auuuuggcau ggucccgucc cauugaaugu 1860 cuuauuauua cacuuuagaaucgcauaacc agcuggagca caauaaugua uaggaauugg 1920 gucaaaagag accuuuggacaggcuugugu uaugguugag guauugcaau guauuaauau 1980 auacucacuc ucauuaauugguacuauauc agguuuauaa aaaagugcau acacugucug 2040 uuucuugucc cuuauuucuguaguuguauu gaaagagcaa uuucucauau cguugguauc 2100 acuaccauua uagguacuacauuuuaaagu gacacagagu gggguuaacu uuacacaugg 2160 cuuuaggcuu ugaucccauaaauugauuau auccucaugc aucugaucca ccaugucauu 2220 uuuccacaug uuaaaauuuucuguuacauu ucccauaacu auuucuugug gguugggguc 2280 uguggguaca caggcauguguagcccagac auuaugcacu ucucuaucau augcuuuagc 2340 aucugaugca cagaauagaguaguuuuugc uucucuccac acagguaccc cauaauagau 2400 ugugacccac aaguuucccaccccacuaca aaucauuaac auccaaaagc cuaagaugcc 2460 ccauauccac cauuguggccaauuccucug uaucccccuc acucucauug 
gccugcucug 2520 aaggaaauuc ccuggccuccccuuguggga aggccaaauu uucccuaaaa aauuagccu 2579 30 307 PRT Humanimmunodeficiency virus type 1 30 Gly Glu Lys Leu Asp Thr Trp Glu Lys IleArg Leu Arg Pro Gly Gly 1 5 10 15 Lys Lys His Tyr Met Leu Lys His IleVal Trp Ala Ser Arg Glu Leu 20 25 30 Glu Arg Phe Ala Leu Asn Pro Gly LeuLeu Glu Thr Ser Glu Gly Cys 35 40 45 Lys Gln Ile Met Lys Gln Leu Gln ProAla Leu Gln Thr Gly Thr Glu 50 55 60 Glu Leu Lys Ser Leu Tyr Asn Thr ValAla Thr Leu Tyr Cys Val His 65 70 75 80 Glu Lys Ile Glu Val Arg Asp ThrLys Glu Ala Leu Asp Lys Ile Glu 85 90 95 Glu Glu Gln Asn Lys Cys Gln GlnLys Thr Gln Gln Ala Lys Ala Ala 100 105 110 Asp Gly Lys Val Ser Gln AsnTyr Pro Ile Val Gln Asn Leu Gln Gly 115 120 125 Gln Met Val His Gln AlaIle Ser Pro Arg Thr Leu Asn Ala Trp Val 130 135 140 Lys Val Ile Glu GluLys Ala Phe Ser Pro Glu Val Ile Pro Met Phe 145 150 155 160 Thr Ala LeuSer Glu Gly Ala Thr Pro Gln Asp Leu Asn Thr Met Leu 165 170 175 Asn ThrVal Gly Gly His Gln Ala Ala Met Gln Met Leu Lys Asp Thr 180 185 190 IleAsn Glu Glu Ala Ala Glu Trp Asp Arg Val His Pro Val His Ala 195 200 205Gly Pro Ile Ala Pro Gly Gln Met Arg Glu Pro Arg Gly Ser Asp Ile 210 215220 Ala Gly Thr Thr Ser Thr Leu Gln Glu Gln Ile Ala Trp Met Thr Ser 225230 235 240 Asn Pro Pro Ile Pro Val Gly Asp Ile Tyr Lys Arg Trp Ile IleLeu 245 250 255 Gly Leu Asn Lys Ile Val Arg Met Tyr Ser Pro Val Ser IleLeu Asp 260 265 270 Ile Arg Gln Gly Pro Lys Glu Pro Phe Arg Asp Tyr ValAsp Arg Phe 275 280 285 Phe Lys Thr Leu Arg Ala Glu Gln Ala Thr Gln GluVal Lys Asn Trp 290 295 300 Met Thr Asp 305 31 278 PRT Humanimmundeficiency virus type 1 31 Pro Leu Thr Glu Glu Lys Ile Lys Ala LeuThr Ala Ile Cys Glu Glu 1 5 10 15 Met Glu Lys Glu Gly Lys Ile Thr LysIle Gly Pro Glu Asn Pro Tyr 20 25 30 Asn Thr Pro Ile Phe Ala Ile Lys LysLys Asp Ser Thr Lys Trp Arg 35 40 45 Lys Leu Val Asp Phe Arg Glu Leu AsnLys Arg Thr Gln Asp Phe Trp 50 55 60 Glu Val Gln Leu Gly Ile Pro His ProAla Gly Leu Lys Lys Lys Lys 65 70 75 80 Ser Val Thr Val Leu 
Asp Val GlyAsp Ala Tyr Phe Ser Val Pro Leu 85 90 95 Asp Glu Gly Phe Arg Lys Tyr ThrAla Phe Thr Ile Pro Ser Ile Asn 100 105 110 Asn Glu Thr Pro Gly Ile ArgTyr Gln Tyr Asn Val Leu Pro Gln Gly 115 120 125 Trp Lys Gly Ser Pro AlaIle Phe Gln Gly Ser Met Thr Lys Ile Leu 130 135 140 Glu Pro Phe Arg AlaGln Asn Pro Glu Ile Val Ile Tyr Gln Tyr Met 145 150 155 160 Asp Asp LeuTyr Val Gly Ser Asp Leu Glu Ile Gly Gln His Arg Ala 165 170 175 Lys IleGlu Glu Leu Arg Glu His Leu Leu Lys Trp Gly Phe Thr Thr 180 185 190 ProAsp Lys Lys His Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr 195 200 205Glu Leu His Pro Asp Lys Trp Thr Val Gln Pro Ile Gln Leu Pro Glu 210 215220 Lys Asp Ser Trp Thr Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu 225230 235 240 Asn Trp Ala Ser Gln Ile Tyr Pro Gly Ile Lys Val Arg Gln LeuCys 245 250 255 Lys Leu Leu Arg Gly Thr Lys Ala Leu Thr Asp Ile Val ProLeu Thr 260 265 270 Glu Glu Ala Glu Leu Glu 275 32 335 PRT Humanimmunodeficiency virus type 1 32 Tyr Cys Ala Pro Ala Gly Tyr Ala Ile LeuLys Cys Asn Asn Lys Thr 1 5 10 15 Phe Asn Gly Thr Gly Pro Cys Asn AsnVal Ser Thr Val Gln Cys Thr 20 25 30 His Gly Ile Met Pro Val Val Ser ThrGln Leu Leu Leu Asn Gly Ser 35 40 45 Leu Ala Glu Glu Glu Ile Ile Ile ArgSer Glu Asn Leu Thr Asn Asn 50 55 60 Ile Lys Thr Ile Ile Val His Leu AsnLys Ser Val Glu Ile Val Cys 65 70 75 80 Thr Arg Pro Asn Asn Asn Thr ArgLys Ser Ile Arg Ile Gly Pro Gly 85 90 95 Gln Thr Phe Tyr Ala Thr Gly GluIle Ile Gly Asn Ile Arg Glu Ala 100 105 110 His Cys Asn Ile Ser Lys SerAsn Trp Thr Ser Thr Leu Glu Gln Val 115 120 125 Lys Lys Lys Leu Lys GluHis Tyr Asn Lys Thr Ile Glu Phe Asn Pro 130 135 140 Pro Ser Gly Gly AspLeu Glu Val Thr Thr His Ser Phe Asn Cys Arg 145 150 155 160 Gly Glu PhePhe Tyr Cys Asn Thr Thr Lys Leu Phe Ser Asn Asn Ser 165 170 175 Asp SerAsn Asn Glu Thr Ile Thr Leu Pro Cys Lys Ile Lys Gln Ile 180 185 190 IleAsn Met Trp Gln Lys Val Gly Arg Ala Met Tyr Ala Pro Pro Ile 195 200 205Glu Gly Asn Ile Thr Cys Lys Ser Asn Ile Thr Gly Leu Leu Leu Thr 210 215220 
Arg Asp Gly Gly Lys Asn Thr Thr Asn Glu Ile Phe Arg Pro Gly Gly 225230 235 240 Gly Asn Met Lys Asp Asn Trp Arg Ser Glu Leu Tyr Lys Tyr LysVal 245 250 255 Val Glu Ile Glu Pro Leu Gly Val Ala Pro Thr Lys Ser LysArg Arg 260 265 270 Val Val Glu Arg Glu Lys Arg Ala Val Gly Leu Gly AlaVal Leu Leu 275 280 285 Gly Phe Leu Gly Ala Ala Gly Ser Thr Met Gly AlaAla Ser Ile Thr 290 295 300 Leu Thr Val Gln Ala Arg Gln Leu Leu Ser GlyIle Val Gln Gln Gln 305 310 315 320 Ser Asn Leu Leu Arg Ala Ile Glu AlaGln Gln His Met Leu Gln 325 330 335

1. A process for the selection of HIV subtype isolates for use in the development of a prophylactic and/or therapeutic pharmaceutical composition comprising the following steps: isolating viruses from recently infected subjects; generating a consensus sequence for at least part of at least one HIV gene by identifying the most common codon or amino acid among the isolated viruses at each position along at least part of the gene; selecting the isolated virus or viruses with a high sequence identity to the consensus sequence and a phenotype which is associated with transmission for the particular HIV subtype.
2. A process according to claim 1, wherein the isolated virus is of the same subtype as a likely challenge strain.
3. A process according to either of claims 1 or 2, wherein the HIV subtype is HIV-1 subtype C.
4. A process according to claim 3, wherein the phenotype which is associated with transmission is a virus that utilises the CCR5 co-receptor and is non-syncytium-inducing (NSI).
5. An HIV-1 subtype C isolate, designated Du422 and assigned Provisional Accession Number 01032114 by the European Collection of Cell Cultures.
6. An HIV-1 subtype C isolate, designated Du151 and assigned Accession Number 00072724 by the European Collection of Cell Cultures.
7. An HIV-1 subtype C isolate, designated Du179 and assigned Accession Number 00072725 by the European Collection of Cell Cultures.
8. A molecule having: (i) the nucleotide sequence set out in Sequence I.D. No. 1; (ii) an RNA sequence corresponding to the nucleotide sequence set out in Sequence I.D. No. 1; (iii) a sequence which will hybridise to the nucleotide sequence set out in Sequence I.D. No. 1 or an RNA sequence corresponding to it, under strict hybridisation conditions; (iv) a sequence which is homologous to the nucleotide sequence set out in Sequence I.D. No. 1 or an RNA sequence corresponding to it; or (v) a sequence which is a modification or derivative of the sequence of any one of (i) to (iv).
9. A molecule according to claim 8, which has the modified sequence set out in Sequence I.D. No. 7.
10. A molecule having: (i) the nucleotide sequence set out in Sequence I.D. No. 3; (ii) an RNA sequence corresponding to the nucleotide sequence set out in Sequence I.D. No. 3; (iii) a sequence which will hybridise to the nucleotide sequence set out in Sequence I.D. No. 3 or an RNA sequence corresponding to it, under strict hybridisation conditions; (iv) a sequence which is homologous to the nucleotide sequence set out in Sequence I.D. No. 3 or an RNA sequence corresponding to it; or (v) a sequence which is a modification or derivative of the sequence of any one of (i) to (iv).
11. A molecule according to claim 10, which has the modified sequence set out in Sequence I.D. No. 9.
12. A molecule having: (i) the nucleotide sequence set out in Sequence I.D. No. 5; (ii) an RNA sequence corresponding to the nucleotide sequence set out in Sequence I.D. No. 5; (iii) a sequence which will hybridise to the nucleotide sequence set out in Sequence I.D. No. 5 or an RNA sequence corresponding to it, under strict hybridisation conditions; (iv) a sequence which is homologous to the nucleotide sequence set out in Sequence I.D. No. 5 or an RNA sequence corresponding to it; or (v) a sequence which is a modification or derivative of the sequence of any one of (i) to (iv).
13. A molecule according to claim 12, which has the modified sequence set out in Sequence I.D. No. 11.
14. A molecule having: (i) the nucleotide sequence set out in Sequence I.D. No. 13; (ii) an RNA sequence corresponding to the nucleotide sequence set out in Sequence I.D. No. 13; (iii) a sequence which will hybridise to the nucleotide sequence set out in Sequence I.D. No. 13 or an RNA sequence corresponding to it, under strict hybridisation conditions; (iv) a sequence which is homologous to the nucleotide sequence set out in Sequence I.D. No. 13 or an RNA sequence corresponding to it; or (v) a sequence which is a modification or derivative of the sequence of any one of (i) to (iv).
15. A molecule according to claim 14, which has a modified sequence which has similar or the same modifications as those set out in Sequence I.D. No. 11 for the env gene of the isolate Du151.
16. A polypeptide having: (i) the amino acid sequence set out in Sequence I.D. No. 2; or (ii) a sequence which is a modification or derivative of the amino acid sequence set out in Sequence I.D. No. 2.
17. A polypeptide according to claim 16, wherein the modified sequence is set out in Sequence I.D. No. 8.
18. A polypeptide having: (i) the amino acid sequence set out in Sequence I.D. No. 4; or (ii) a sequence which is a modification or derivative of the amino acid sequence set out in Sequence I.D. No. 4.
19. A polypeptide according to claim 18, wherein the modified sequence is that set out in Sequence I.D. No. 10.
20. A polypeptide having: (i) the amino acid sequence set out in Sequence I.D. No. 6; or (ii) a sequence which is a modification or derivative of the amino acid sequence set out in Sequence I.D. No. 6.
21. A polypeptide according to claim 20, wherein the modified sequence is that set out in Sequence I.D. No. 12.
22. A polypeptide having: (i) the amino acid sequence set out in Sequence I.D. No. 14; (ii) a sequence which is a modification or derivative of the amino acid sequence set out in Sequence I.D. No. 14.
23. A polypeptide according to claim 22, wherein the modified sequence has similar or the same modifications as those set out in Sequence I.D. No. 12 for the amino acid sequence of the env gene of the isolate Du151.
24. A consensus amino acid sequence for the partial gag gene of HIV-1 subtype C which is:
GEKLDKWEKI RLRPGGKKHY MLKHLVWASR ELERFALNPG LLETSEGCKQ⁵⁰
IMKQLQPALQ TGTEELRSLY NTVATLYCVH EKIEVRDTKE ALDKIEEEQN¹⁰⁰
KSQQ-CQQKT QQAKAADGG- KVSQNYPIVQ NLQGQMVHQA ISPRTLNAWV¹⁵⁰
EEKAFSP    EVIPMFTALS EGATPQDLNT MLNTVGGHQA AMQMLKDTIN²⁰⁰
EEAAEWDRLH PVHAGPIAPG QMREPRGSDI AGTTSTLQEQ IAWMTSNPPI²⁵⁰
PVGDIYKRWI ILGLNKIVRM YSPVSILDIK QGPKEPFRDY VDRFFKTLRA³⁰⁰
EQATQDVKNW MTD³¹³


25. A consensus amino acid sequence for the partial pol gene of HIV-1 subtype C which is:
LTEEKIKALT AICEEMEKEG KITKIGPENP YNTPVFAIKK KDSTKWRKL-⁵⁰
VDFRELNKRT QDFWEVQLGI PHPAGLKKKK SVTVLDVGDA YFSVPLDEGF¹⁰⁰
RKYTAFTIPS INNETPGIRY QYNVLPQGWK GSPAIFQSSM TKILEPFRAK¹⁵⁰
NPEIVIYQYM DDLYVGSDLE IGQHRAKIEE LREHLLKWGF TTPDKKHQKE²⁰⁰
PPFLWMGYEL HPDKWTVQPI QLPEKDSWTV NDIQKLVGKL NWASQIYPGI²⁵⁰
KVRQLCKLLR GAKALTDIVP LTEEAELE²⁷⁸


26. A consensus amino acid sequence for the partial env gene of HIV-1 subtype C which is:
YCAPAGYAIL KCNNKTFNGT GPCNNVSTVQ CTHGIKPVVS TQLLLNGSLA⁵⁰
EEEIIIRSEN LTNNAKTIIV HLNESVEIVC TRPNNNTRKS IRIGPGQTFY¹⁰⁰
ATGDIIGDIR QAHCNISEGK WNKTLQKVKK KLKEELYKYK VVEIKPLGIA¹⁵⁰
PTEAKRRVVE REKRAVGIGA VFLGFLGAAG STMGAASITL TVQARQLLSG²⁰⁰
IVQQQSNLLR AIEAQQHMLQ LTVWGIKQL²²⁹


27. A process according to claim 1, substantially as herein described.
28. An HIV-1 subtype C isolate according to claim 5, substantially as herein described.
29. An HIV-1 subtype C isolate according to claim 6, substantially as herein described.
30. An HIV-1 subtype C isolate according to claim 7, substantially as herein described.
31. A molecule according to claim 8, substantially as herein described.
32. A molecule according to claim 10, substantially as herein described.
33. A molecule according to claim 12, substantially as herein described.
34. A molecule according to claim 14, substantially as herein described.
35. A polypeptide according to claim 16, substantially as herein described.
36. A polypeptide according to claim 18, substantially as herein described.
37. A polypeptide according to claim 20, substantially as herein described.
38. A polypeptide according to claim 22, substantially as herein described.
39. A consensus amino acid sequence according to claim 24, substantially as herein described.
40. A consensus amino acid sequence according to claim 25, substantially as herein described.
41. A consensus amino acid sequence according to claim 26, substantially as herein described.
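
By way of illustration only, the sequence-based steps of the process of claim 1 (identifying the most common residue at each position and ranking isolates by identity to the resulting consensus) might be sketched as follows. This is a minimal sketch and not the procedure actually used: it assumes the sequences have already been aligned to equal length, the function names `consensus`, `identity` and `rank_isolates` are hypothetical, ties between equally common residues are broken by first occurrence (an arbitrary choice), and the phenotype criterion (for example CCR5 usage and NSI status) is not modelled.

```python
from collections import Counter


def consensus(aligned):
    """Return the most common residue at each column of pre-aligned sequences.

    Assumption: all sequences have the same length (gaps written as '-').
    Ties are broken by first occurrence, which is an arbitrary choice.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be non-empty and aligned to one length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def identity(seq, ref):
    """Fraction of aligned positions at which seq matches ref."""
    return sum(a == b for a, b in zip(seq, ref)) / len(ref)


def rank_isolates(isolates):
    """Return (consensus, [(name, identity), ...]) with highest identity first.

    `isolates` maps a hypothetical isolate name to its aligned sequence.
    """
    cons = consensus(list(isolates.values()))
    ranked = sorted(
        ((name, identity(seq, cons)) for name, seq in isolates.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return cons, ranked
```

Calling `rank_isolates` on a dictionary of aligned partial gag, pol or env sequences would return the consensus together with the isolates ordered by identity to it; isolates near the top of the list would then still have to be checked against the transmission-associated phenotype before selection.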